Free Tool
Robots.txt rule builder
Build a valid robots.txt file with our visual editor. Add user agents, configure allow and disallow rules, include sitemap URLs, and set crawl-delay directives – then copy or download the finished file.
Rule editor
Sitemap URLs
Crawl-delay (optional)
Seconds between requests. Not all crawlers honour this directive. Googlebot ignores it – use Google Search Console instead.
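For crawlers that do respect the directive (Bingbot, for example), Crawl-delay sits inside a user-agent group. A minimal sketch, with an illustrative value:

```
User-agent: Bingbot
Crawl-delay: 10
```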
Live preview
User-agent: *
Disallow: /admin/
Allow: /
Sitemap: https://example.com/sitemap.xml
Robots.txt best practices
The robots.txt file sits at the root of your domain and tells search engine crawlers which parts of your site they can and cannot access. It is one of the first files a crawler requests, so getting it right is essential for healthy indexation.
What does robots.txt do?
A robots.txt file provides crawl directives to web robots (also called spiders or bots). It follows the Robots Exclusion Protocol – a long-standing convention, formalised as RFC 9309 in 2022, that all major search engines honour. Each rule group targets a specific user agent and contains allow or disallow directives for URL paths.
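As a sketch, a file with two rule groups might look like this (the paths are placeholders):

```
# Group targeting Google's main crawler
User-agent: Googlebot
Disallow: /internal-search/

# Fallback group for all other crawlers
User-agent: *
Disallow: /admin/
Allow: /
```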
Manage crawl budget efficiently
Prevent bots from wasting time on low-value pages such as admin panels, internal search results, and staging areas so they spend more time on the pages that matter.
Keep private areas out of search results
Disallow paths like /admin/, /account/, and /checkout/ so crawlers do not fetch sensitive or duplicate pages. Bear in mind that blocking crawling alone does not guarantee a page stays out of the index – see the noindex point further down.
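Using the paths mentioned above, a single rule group covering all crawlers would look like this:

```
User-agent: *
Disallow: /admin/
Disallow: /account/
Disallow: /checkout/
```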
Point crawlers to your sitemap
Including a Sitemap: directive in your robots.txt helps crawlers discover your XML sitemap more quickly, which can accelerate indexation of new and updated pages.
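The Sitemap: directive takes an absolute URL and sits outside any user-agent group; you can list more than one (the URLs here are placeholders):

```
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog-sitemap.xml
```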
Control AI crawler access
Newer user agents like GPTBot, ChatGPT-User, and Google-Extended can be specifically blocked or allowed, giving you control over how your content is used for AI training and retrieval.
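For example, to opt out of AI crawlers entirely while leaving other rules untouched:

```
# Blocks OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Opts content out of Google's AI model training
User-agent: Google-Extended
Disallow: /
```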
Common robots.txt mistakes
Blocking CSS and JavaScript
Disallowing /css/ or /js/ prevents Google from rendering your pages properly, which can harm rankings. Always allow access to static assets.
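A safer pattern keeps rendering resources crawlable even when a parent directory is blocked. The /assets/ paths below are placeholders; for Google, the more specific matching rule generally wins:

```
User-agent: *
Allow: /assets/css/
Allow: /assets/js/
Disallow: /assets/
```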
Using robots.txt instead of noindex
Disallowing a URL does not remove it from Google's index – it only prevents crawling. If a page is already indexed, use a noindex meta tag instead.
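The meta tag version goes in the page's <head> – and the page must remain crawlable, or Google will never see the directive:

```
<meta name="robots" content="noindex">
```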
Accidentally blocking the entire site
A single Disallow: / under User-agent: * with no corresponding Allow rules will block every crawler from your entire site. Always double-check your rules before deploying.
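This is the pattern to watch for – two short lines that block crawling of the whole site:

```
User-agent: *
Disallow: /
```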
Placing the file in the wrong location
The robots.txt file must be served at https://yourdomain.com/robots.txt – the exact root of the domain. Placing it in a subdirectory or serving it from a different subdomain will have no effect.
Need help with your technical SEO?
A well-configured robots.txt is just one piece of the puzzle. Our Liverpool-based specialists can audit your entire technical SEO setup and make sure crawlers are seeing exactly what you want them to. Get a free audit to find your biggest opportunities.