Free Tool
Robots.txt rule builder
Build a valid robots.txt file with our visual editor. Add user agents, configure allow and disallow rules, include sitemap URLs, and set crawl-delay directives – then copy or download the finished file.
Rule editor
Sitemap URLs
Crawl-delay (optional)
Seconds between requests. Not all crawlers honour this directive. Googlebot ignores it – use Google Search Console instead.
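For crawlers that do respect the directive (Bingbot, for example), Crawl-delay sits inside a user-agent group. A minimal sketch, with an illustrative value:

```
User-agent: Bingbot
Crawl-delay: 10
```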
Live preview
User-agent: *
Disallow: /admin/
Allow: /
Sitemap: https://example.com/sitemap.xml
Robots.txt best practices
The robots.txt file sits at the root of your domain and tells search engine crawlers which parts of your site they can and cannot access. It is one of the first files a crawler requests, so getting it right is essential for healthy indexation.
What does robots.txt do?
A robots.txt file provides crawl directives to web robots (also called spiders or bots). It follows the Robots Exclusion Protocol – a long-standing convention, formalised as RFC 9309 in 2022, that all major search engines honour. Each rule group targets a specific user agent and contains allow or disallow directives for URL paths.
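As a sketch, a file with two rule groups might look like this (the paths are placeholders):

```
# Group targeting Google's main crawler
User-agent: Googlebot
Disallow: /internal-search/

# Fallback group for all other crawlers
User-agent: *
Disallow: /admin/
Allow: /
```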
Manage crawl budget efficiently
Prevent bots from wasting time on low-value pages such as admin panels, internal search results, and staging areas so they spend more time on the pages that matter.
Keep private areas out of search results
Disallow paths like /admin/, /account/, and /checkout/ so crawlers do not fetch sensitive or duplicate pages. Bear in mind that blocking crawling alone does not guarantee a page stays out of the index – see the noindex point further down.
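Using the paths mentioned above, a single rule group covering all crawlers would look like this:

```
User-agent: *
Disallow: /admin/
Disallow: /account/
Disallow: /checkout/
```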
Point crawlers to your sitemap
Including a Sitemap: directive in your robots.txt helps crawlers discover your XML sitemap more quickly, which can accelerate indexation of new and updated pages.
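The Sitemap: directive takes an absolute URL and sits outside any user-agent group; you can list more than one (the URLs here are placeholders):

```
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog-sitemap.xml
```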
Control AI crawler access
Newer user agents like GPTBot, ChatGPT-User, and Google-Extended can be specifically blocked or allowed, giving you control over how your content is used for AI training and retrieval.
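For example, to opt out of AI crawlers entirely while leaving other rules untouched:

```
# Blocks OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Opts content out of Google's AI model training
User-agent: Google-Extended
Disallow: /
```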
Common robots.txt mistakes
Blocking CSS and JavaScript
Disallowing /css/ or /js/ prevents Google from rendering your pages properly, which can harm rankings. Always allow access to static assets.
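A safer pattern keeps rendering resources crawlable even when a parent directory is blocked. The /assets/ paths below are placeholders; for Google, the more specific matching rule generally wins:

```
User-agent: *
Allow: /assets/css/
Allow: /assets/js/
Disallow: /assets/
```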
Using robots.txt instead of noindex
Disallowing a URL does not remove it from Google's index – it only prevents crawling. If a page is already indexed, use a noindex meta tag instead.
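The meta tag version goes in the page's <head> – and the page must remain crawlable, or Google will never see the directive:

```
<meta name="robots" content="noindex">
```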
Accidentally blocking the entire site
A single Disallow: / under User-agent: * with no corresponding Allow rules will block every crawler from your entire site. Always double-check your rules before deploying.
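This is the pattern to watch for – two short lines that block crawling of the whole site:

```
User-agent: *
Disallow: /
```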
Placing the file in the wrong location
The robots.txt file must be served at https://yourdomain.com/robots.txt – the exact root of the domain. Placing it in a subdirectory or serving it from a different subdomain will have no effect.
Need help with your technical SEO?
A well-configured robots.txt is just one piece of the puzzle. Our Liverpool-based specialists can audit your entire technical SEO setup and make sure crawlers are seeing exactly what you want them to. Get a free audit to find your biggest opportunities.