Step 8 - Creating a robots.txt File

Most websites don't need a robots.txt file. That's because Google can usually find and index all of the important pages on your site, and it will automatically leave out pages that aren't important or that are duplicate versions of other pages.

What does a robots.txt file do?

A robots.txt file tells search engine crawlers which pages or files the crawler can or can't request from your site. This is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google.

The robots.txt file for this site is at https://thewebtoolbox.com/robots.txt.
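
To see how a crawler interprets the file, you can use the robots.txt parser in Python's standard library. The snippet below is a minimal sketch, not part of this site's code: it fetches the live file and asks whether a generic crawler (user agent *) may request two example paths.

Checking the live file with Python
from urllib.robotparser import RobotFileParser

# A well-behaved crawler downloads robots.txt once, then consults
# it before each request it makes to the site.
rp = RobotFileParser()
rp.set_url("https://thewebtoolbox.com/robots.txt")
rp.read()  # fetch and parse the live file

# can_fetch(user_agent, url): may this crawler request the URL?
print(rp.can_fetch("*", "https://thewebtoolbox.com/admin/"))  # expect False
print(rp.can_fetch("*", "https://thewebtoolbox.com/code/"))   # expect True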

What does a robots.txt file look like?

The snippet below shows the contents of this site's robots.txt file as it stands at this point in the series.

The robots.txt file
User-agent: *
Disallow: /admin/
Disallow: /admin/dashboard/
Disallow: /forgot/
Allow: /code/
Allow: /contact/
Allow: /info/
Allow: /register/
Allow: /utilities/
Allow: /visitors/
Sitemap: https://thewebtoolbox.com/sitemap.xml
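
You can also test rules like these before uploading anything. The sketch below, again using Python's standard-library parser, feeds a few of the lines above straight in and confirms the expected behaviour. Note that Disallow matches by path prefix, so Disallow: /admin/ already covers /admin/dashboard/.

Testing the rules locally
from urllib.robotparser import RobotFileParser

# Feed the rules in as a list of lines; no web server is needed.
rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Allow: /code/",
]
rp = RobotFileParser()
rp.parse(rules)

# Disallow matches by prefix, so /admin/dashboard/ is covered too.
assert not rp.can_fetch("*", "https://thewebtoolbox.com/admin/dashboard/")
assert rp.can_fetch("*", "https://thewebtoolbox.com/code/")
print("Rules behave as intended")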

How do I create a robots.txt file?

  • Any regular text editor can be used to create the file, as long as it is saved as PLAIN TEXT (no bold or styling of any kind).
  • Save the file as robots.txt, ALL LOWER CASE.
  • Make sure that you don't allow and disallow the same folder.
  • Remember that in the header of each page you will have added a meta robots instruction to index and follow or noindex and nofollow. See The Header.
  • Make sure that the file is saved in the root directory of your site, so it is served at /robots.txt. You can verify this with the check after this list.
  • It is good practice to include the location of your sitemap, as the last line of the example above does.
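
Once the file is uploaded, you can confirm the plain-text and root-directory points by requesting it exactly as a crawler would. A small sketch using Python's standard library; the domain is this site's, so substitute your own.

Verifying the uploaded file
from urllib.request import urlopen

# The file must be reachable at the site root and served as plain text.
with urlopen("https://thewebtoolbox.com/robots.txt") as resp:
    assert resp.status == 200, "robots.txt was not found at the root"
    assert resp.headers.get_content_type() == "text/plain", "not served as plain text"
    print(resp.read().decode("utf-8"))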