The robots.txt file is a good way to keep search engine spiders from crawling, and therefore indexing, a page or site. However, not every site can use it: the only robots.txt file the spiders will read is the one in the top-level HTML directory of your server, so you can only use it if you run your own domain. The spiders will look for the file at a location like this:
http://www.mysite.com/robots.txt
Create a file called robots.txt and list in it the areas you want to protect.
If you want to exclude all the search engine spiders from your entire domain, you would write just the following into the robots.txt file:
User-agent: *
Disallow: /
If you want to exclude all the spiders from a certain directory within your site, you would write the following:
User-agent: *
Disallow: /yourgallery/
If you want to do this for multiple directories, you add one Disallow line per directory:
User-agent: *
Disallow: /yourgallery/
Disallow: /anotherdirectory/
If you want to exclude certain files, type in the rest of the path to each file you want to exclude:
User-agent: *
Disallow: /yourgallery/index.php
Disallow: /directory/example.htm
If you want to keep a specific search engine spider from indexing your site, do this:
(Note: Google's main crawler identifies itself as Googlebot, so use that as the robot name for Google.)
User-agent: Robot_Name
Disallow: /
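All of these rule groups can live together in the same robots.txt file; a blank line separates one User-agent record from the next. A combined file (with placeholder directory and robot names) might look like this:

User-agent: Robot_Name
Disallow: /

User-agent: *
Disallow: /yourgallery/
Disallow: /directory/example.htm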
This is what I use for certain sections and it actually works for me.
For more on writing these rules, including how to allow specific robots, go to:
http://www.searchtools.com/robots/robots-txt.html
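If you want to double-check that your finished file behaves the way you expect before uploading it, Python's standard urllib.robotparser module can parse the rules and report what a compliant spider would do. The site name and paths below are just placeholders matching the examples above:

```python
from urllib.robotparser import RobotFileParser

# The same rules as the examples above, as they would appear in robots.txt.
rules = [
    "User-agent: *",
    "Disallow: /yourgallery/",
    "Disallow: /directory/example.htm",
]

rp = RobotFileParser()
rp.parse(rules)

# A compliant spider may not fetch anything under /yourgallery/ ...
print(rp.can_fetch("*", "http://www.mysite.com/yourgallery/index.php"))
# ... but the rest of the site is still fair game.
print(rp.can_fetch("*", "http://www.mysite.com/index.html"))
```

Note that this only tells you what well-behaved spiders should do; robots.txt is advisory, and a rogue crawler is free to ignore it.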