After my earlier message, I got several inquiries about what a
robots.txt file is. I thought I would provide a quick overview for
everyone.
A robots.txt file is a way to keep search engines FROM indexing a page
or a site. Normally this is the opposite of what you want, but
sometimes you have a reason to keep a page out of Google or Yahoo
(such as a file called, say, "emailaddresses.html" ;) ).
Creating this file is simple. First, open up Notepad (or another text
editor). On the first line, type the following:
User-agent: *
This tells ALL search engines that the rules that follow apply to them.
Now, let's go back to the example of "emailaddresses.html". If I want
to keep this file from being "indexed" by the search engines, I would
type this (the path starts with a slash because it is given relative
to the root of the site):
Disallow: /emailaddresses.html
You can also disallow entire directories:
Disallow: /images/
So, if I wanted to keep the search engines from indexing the
"emailaddresses.html" and the /images/ directory on my site, the
entire thing would look like this:
User-agent: *
Disallow: /emailaddresses.html
Disallow: /images/
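If you want to double-check the rules before uploading, here is a rough
sketch using the robots.txt parser that ships with Python. It assumes
the rules are written with a leading slash on each path, and
"ExampleBot" is just a made-up crawler name for illustration. The
is_allowed helper is my own, not part of any library.

```python
from urllib.robotparser import RobotFileParser

# The example file from this message, with each path written
# relative to the root of the site (leading slash).
ROBOTS_TXT = """\
User-agent: *
Disallow: /emailaddresses.html
Disallow: /images/
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if the rules let user_agent fetch path.

    "ExampleBot" is a hypothetical crawler name; any name falls
    under the "User-agent: *" section.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/emailaddresses.html", "/images/logo.png", "/index.html"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

The first two paths should come back as blocked and "/index.html" as
allowed, since nothing in the file mentions it.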
I would save this file with the name "robots.txt" (IT MUST BE NAMED
THAT!) and upload it to the "root" web directory (the "public_html"
directory if you have one).
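The reason it has to go in the root directory is that search engines
look for the file at one fixed address: /robots.txt at the top of the
site. As a small illustration (www.example.com is a placeholder
domain, and robots_url is a helper name I made up), this works out
where a crawler would look for any given page:

```python
from urllib.parse import urljoin


def robots_url(page_url):
    """Return the address where a crawler looks for robots.txt."""
    # The leading "/" makes urljoin drop the page's own path and
    # resolve against the root of the site.
    return urljoin(page_url, "/robots.txt")


if __name__ == "__main__":
    # Prints http://www.example.com/robots.txt
    print(robots_url("http://www.example.com/blog/post.html"))
```

If the file ends up in a subdirectory instead, it will not be at that
address and will not be found.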
That's it... simple enough to do.
Jason