Robots.txt

by

Pasta

Pasta's Pissers

What Is It
As the Internet's popularity began to rise in the early nineties, search engines were thirsty for information to include in their databases. They sent out spiders/robots to retrieve info from websites by spidering directory trees, following page links, and ingesting metadata to create their search indexes. Their thirst also took its toll on system resources: a robot would send out request after request, loading many pages in quick succession. Webmasters and bot coders quickly adopted the robots exclusion protocol as a way to control the cataloging/indexing process.

Robots.txt is a plain ASCII text file that tells the SE (search engine) robots which pages and directories they are NOT allowed to travel; anything you don't disallow stays open. Something important to keep in mind is ROOT ACCESS: robots only look for the file in the top-level (root) directory of a site, so if you can't put a file there, you don't get robots.txt control. What I mean is, if your hosting account is just a subdirectory of somebody else's site, you do NOT control the root, therefore no robots.txt. All is not lost if that's you. You can use a robots meta tag for similar, more localized control of SE indexing of your pages. I have done a lot of reading, but I haven't personally utilized a robots.txt file on the roots of my domains, so I don't know what effect it would have on a virtual account.
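To make the robots meta tag a little more concrete, here is a minimal sketch in Python of how a robot might read one out of a page. The sample page, the class name RobotsMetaParser and the helper robots_directives are made up for illustration; "noindex" and "nofollow" are common values for the tag, but this article doesn't cover which values any particular robot honors.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex,nofollow".
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


# Hypothetical page that asks robots not to index it or follow its links.
PAGE = """<html><head>
<meta name="robots" content="noindex,nofollow">
<title>Private page</title>
</head><body>...</body></html>"""


def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


print(robots_directives(PAGE))  # ['noindex', 'nofollow']
```

The tag goes in the head of each page you want to control, which is why it works without any access to the site root.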

Pasta's Cool DFN Link Of The Week

How You Doin? The Power of 3

"Undertake something that is difficult; it will do you good. Unless you try to do something beyond what you have already mastered, you will never grow."
Ronald E. Osborn

File Structure

Create a simple text file called robots.txt. A few examples of a robots.txt file:

User-agent: *
Disallow: /
This file would disallow all robots from all your pages.

User-agent: *
Disallow: /personal/
This file would disallow all robots from the "personal" directory.
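If you'd like to see how a robot might read the /personal/ example, here is a small sketch using urllib.robotparser from Python's standard library. The rules are held in a string so nothing is fetched over the network; the robot name "ExampleBot", the sample paths and the helper is_allowed are assumptions for illustration, and real robots may interpret rules a bit differently.

```python
from urllib.robotparser import RobotFileParser

# The /personal/ example, kept in a string instead of a real file.
RULES = """\
User-agent: *
Disallow: /personal/
"""


def is_allowed(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed(RULES, "ExampleBot", "/personal/diary.html"))  # False
print(is_allowed(RULES, "ExampleBot", "/index.html"))           # True
print(is_allowed(RULES, "ExampleBot", "/personal.html"))        # True
```

The last line shows that a Disallow value is matched as a prefix of the path: /personal.html does not start with /personal/, so it stays open.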

User-agent: Hotbot
Disallow:
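This file would allow Hotbot everywhere (an empty Disallow means nothing is off limits). On its own it does NOT shut out other robots, because a robot that finds no record for itself assumes it is allowed. To allow Hotbot only, add a second record after a blank line with User-agent: * and Disallow: /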
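Here is a quick sketch for checking how the Hotbot record treats other robots, again with Python's standard urllib.robotparser. The second rule set, with a catch-all record added, is my own illustration rather than one of the examples above, and "OtherBot" is a made-up robot name.

```python
from urllib.robotparser import RobotFileParser

# The Hotbot record exactly as shown above.
HOTBOT_RECORD = """\
User-agent: Hotbot
Disallow:
"""

# Assumed variant: the same record plus a catch-all that blocks everyone else.
HOTBOT_ONLY = """\
User-agent: Hotbot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed(HOTBOT_RECORD, "Hotbot", "/index.html"))    # True
print(is_allowed(HOTBOT_RECORD, "OtherBot", "/index.html"))  # True
print(is_allowed(HOTBOT_ONLY, "Hotbot", "/index.html"))      # True
print(is_allowed(HOTBOT_ONLY, "OtherBot", "/index.html"))    # False
```

Only the version with the catch-all record keeps OtherBot out.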

If you want to experiment with robots.txt, keep in mind that even though it's a simple file, it's prone to mistakes :) So a good idea would be to check the file's syntax here:
Robots.txt Syntax Check
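For a rough idea of what a syntax check looks for, here is a minimal sketch in Python. It is not the checker linked above; the function check_robots_txt and its messages are hypothetical, and it only knows the two fields used in this article's examples (User-agent and Disallow), so any other field a robot might accept would be flagged as unknown.

```python
KNOWN_FIELDS = {"user-agent", "disallow"}


def check_robots_txt(text):
    """Return a list of (line_number, message) problems found in `text`."""
    problems = []
    in_record = False  # True once a User-agent line has opened a record
    for number, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            in_record = False  # a blank line ends the current record
            continue
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue  # comment-only line
        if ":" not in line:
            problems.append((number, "missing ':' between field and value"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field == "user-agent":
            if not value:
                problems.append((number, "User-agent needs a value"))
            in_record = True
        elif not in_record:
            problems.append((number, "Disallow appears before any User-agent"))
        elif value and not value.startswith("/"):
            problems.append((number, "Disallow path should start with '/'"))
    return problems


GOOD = "User-agent: *\nDisallow: /personal/\n"
BAD = "Disallow: /personal/\nUser agent *\nDisalow: /tmp/\n"

print(check_robots_txt(GOOD))  # []
for number, message in check_robots_txt(BAD):
    print(number, message)
```

The BAD sample trips three checks: a Disallow with no User-agent before it, a line with no colon, and a misspelled field name.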

Another cool link which has many uses, one of the many web robot databases:
Web Robots Database

Here is a free tool to help avoid syntax errors
Robots.txt Generator
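If you'd rather roll your own, the idea behind a generator can be sketched in a few lines of Python. This is not the tool linked above; build_robots_txt and the shape of its input are assumptions for illustration. Each record is a robot name plus the list of paths to disallow, and an empty list becomes an empty Disallow line.

```python
def build_robots_txt(records):
    """Build robots.txt text from a list of (user_agent, disallowed_paths) pairs."""
    blocks = []
    for agent, paths in records:
        lines = [f"User-agent: {agent}"]
        # An empty Disallow line means nothing is off limits for this robot.
        lines += [f"Disallow: {path}" for path in paths] or ["Disallow:"]
        blocks.append("\n".join(lines))
    # Records are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


print(build_robots_txt([("Hotbot", []), ("*", ["/personal/"])]))
```

Writing the file from a small function like this avoids the usual hand-typed mistakes, such as a misspelled field name or a missing blank line between records.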

Hasta Pasta


©2001 VNWR. All rights reserved.