The DFN Lounge With Pasta
Robots.txt
What Is It?
As the Internet's popularity began to rise in the early nineties, search engines were thirsty for information to include in their databases. They sent out spiders (robots) to retrieve info from websites by crawling directory trees, following page links, and ingesting metadata to build their search indexes. That thirst also took its toll on system resources: a robot would send request after request, loading many pages in quick succession. Webmasters and bot coders quickly adopted the robots exclusion protocol as a way to control the cataloging and indexing process.
Robots.txt is a plain ASCII text file that tells search engine (SE) robots which pages and directories they may or may not crawl. Something important to keep in mind is the ROOT of your site: robots only look for the file at the top level of a host, so if you cannot place a file there, you don't have robots.txt control. You do not need root or admin privileges on the server itself. What matters is whether you control the document root of the domain. If you have a virtual hosting account with its own domain, you can normally upload a robots.txt there. If your pages live in a subdirectory of someone else's domain, you cannot. All is not lost in that case. You can use a robots meta tag for similar, more localized control of SE indexing of your pages. I have done a lot of reading, but I haven't personally used a robots.txt file on the roots of my domains, so I can't say from experience how it behaves on a virtual account.
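
To make the idea concrete, here is a minimal sketch of what a robots.txt might contain and how a well-behaved robot would read it. It uses Python's standard urllib.robotparser module. The paths, the bot names (AnyBot, BadBot) and the www.example.com host are made up for illustration, and the meta tag string is shown only as the per-page alternative mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The directories and the bot name
# "BadBot" are invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

User-agent: BadBot
Disallow: /
"""

# Per-page alternative for when you cannot place a file at the site root:
# a robots meta tag that goes in the <head> of an individual HTML page.
ROBOTS_META_TAG = '<meta name="robots" content="noindex, nofollow">'


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Any robot may fetch ordinary pages...
print(parser.can_fetch("AnyBot", "http://www.example.com/index.html"))
# ...but not anything under a disallowed directory.
print(parser.can_fetch("AnyBot", "http://www.example.com/private/notes.html"))
# BadBot is shut out of the whole site.
print(parser.can_fetch("BadBot", "http://www.example.com/index.html"))
```

Keep in mind that robots.txt is only a request. Polite robots honor it, but nothing forces a robot to obey, so it is no substitute for real access control.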



