Robots.txt
Definition
Robots.txt is a simple text file placed in a website's root directory that provides instructions to search engine crawlers about which areas of the site should not be processed or scanned. It's part of the Robots Exclusion Protocol (REP), a group of web standards that regulate how robots crawl the web.
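For illustration, a minimal robots.txt might look something like this (the paths and the sitemap URL are placeholders, not recommendations for any particular site):

    User-agent: *
    Disallow: /admin/
    Disallow: /tmp/
    Allow: /admin/public/

    Sitemap: https://www.example.com/sitemap.xml

Directives are grouped under a User-agent line naming the crawler they apply to (an asterisk matches all crawlers); Disallow and Allow rules match URL path prefixes, and the optional Sitemap line points crawlers to your sitemap file.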
Google Search Console includes a robots.txt Tester tool, which lets you check whether your robots.txt file is configured correctly and see how specific URLs are affected by it. Used well, robots.txt helps you manage your site's crawl budget and steer crawlers away from duplicate or low-value content. Keep in mind that it controls crawling rather than indexing: a URL blocked in robots.txt can still be indexed if other pages link to it, so use a noindex directive or authentication for content that must stay out of search results.
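If you want to check URLs programmatically rather than in Search Console, Python's standard library ships a robots.txt parser. A minimal sketch, assuming the file lives at the conventional root location (the example.com URLs and user agent names are placeholders):

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the site's robots.txt from its root directory
    parser = RobotFileParser()
    parser.set_url("https://www.example.com/robots.txt")
    parser.read()

    # Ask whether a given crawler is allowed to request a specific URL
    print(parser.can_fetch("Googlebot", "https://www.example.com/admin/settings"))
    print(parser.can_fetch("*", "https://www.example.com/blog/robots-txt-guide"))

can_fetch returns True or False based on the rules that apply to the named user agent, which mirrors how a well-behaved crawler decides whether to request a page.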
Related Terms
Crawling
Crawling is the process by which search engines discover and scan web pages, collecting information to index and rank content.
Indexing
Indexing is the process by which search engines add your web pages to their database, making them eligible to appear in search results.
Sitemap
A Sitemap is a file that lists a website's important pages, images, and files, helping search engines understand your site structure and content.