Simplehost Web Hosting

Robots.txt: what is it and what does it do?

robots.txt is a plain text file placed in the root (top-most folder) of your website. It is used to restrict what search engine robots, also known as crawlers, may index on your site. You should only use it if you want certain pages NOT to be indexed and displayed in search results, e.g. on google.com.

The contents of the file must conform to strict guidelines, but it is still rather simple to understand. The file is completely optional; your site does not need one to work. If you do decide to create one, place it in the root folder and edit it with a plain text editor such as Notepad on Windows or TextEdit on a Mac. Place only the directives, such as those in the examples below, at the top of the file and nothing else.
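Search engines always request the file from the fixed address /robots.txt at the top of your domain, so once uploaded you can verify it is being served and read correctly. Here is a minimal sketch using Python's standard library (www.example.com is a placeholder; substitute your own domain):

import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")  # placeholder domain
rp.read()  # fetch and parse the live file
# True if crawlers identifying as Googlebot may fetch the home page
print(rp.can_fetch("Googlebot", "https://www.example.com/"))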

An example of disallowing all search engines from crawling your site, which will prevent your site from showing in search results:
 
User-agent: *
Disallow: /
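If you want to confirm how a compliant crawler would interpret those two lines, Python's built-in urllib.robotparser module can parse them directly. A minimal sketch (example.com and the Googlebot user agent are only illustrative):

import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# With "Disallow: /" every URL on the site is off-limits.
print(rp.can_fetch("Googlebot", "https://example.com/"))          # False
print(rp.can_fetch("Googlebot", "https://example.com/any/page"))  # False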

 
Or, to disallow just a folder and its contents:
User-agent: *
Disallow: /cgi-bin/

 
To disallow a single page:
User-agent: *
Disallow: /secret.html


And finally, to disallow multiple pages and folders:
User-agent: *
Disallow: /images/
Disallow: /secret.html
Disallow: /secret2.html
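Because Disallow rules match URLs by prefix, /images/ blocks everything inside that folder while other pages remain crawlable. A quick check of the combined rules above, again with Python's standard library (example.com is a placeholder):

import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /images/",
    "Disallow: /secret.html",
    "Disallow: /secret2.html",
])

for path in ("/", "/about.html", "/images/logo.png", "/secret.html"):
    allowed = rp.can_fetch("Googlebot", "https://example.com" + path)
    print(path, allowed)
# Prints: / True, /about.html True, /images/logo.png False, /secret.html False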


We recommend you use robots.txt carefully to avoid accidentally blocking your site from search engines, which would prevent users from finding your site on the Internet.
