Published: Feb 13, 2008 - 01:55 pm
Story Found By: MattMcGee 1563 Days ago
Category: SEM
5 Comments
Comments
lol dude, this is crazy, man. I have always thought that the robots.txt file should be allowed to live at any location, much like the XML sitemap: just register it in the webmaster account. These files always seem so revealing. Your client's webmaster should know to use subdomains for testing and never to link from a "live" source to the testing area. Bots mostly know about pages because of links, though some run fancy search queries and so forth.
I've been warning people about the exposure of robots.txt for years, purely from a website security point of view, and solving this is not rocket science. The solution is called a DYNAMIC ROBOTS.TXT file: you serve up the REAL robots.txt file only to valid, whitelisted bots from major search engines, verified by IP range and/or full round-trip DNS validation (reverse lookup, then forward lookup). Everyone else sees the following:
User-agent: *
Disallow: /
Which will confuse the hell out of them, which is a good thing if they're snooping!
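For anyone wondering what that looks like in practice, here is a rough Python sketch of the dynamic robots.txt idea described above. It is not the commenter's actual implementation. The hostname suffixes, the `/staging/` path, and the function names `is_verified_bot` and `robots_txt_for` are all assumptions made up for illustration, so check each engine's own crawler documentation before trusting any list like this. It only covers the DNS half (reverse lookup, then forward lookup back to the same IP), not IP-range whitelisting.

```python
import socket

# Assumed hostname suffixes for major crawlers; verify against each
# engine's documentation before relying on them.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")

# Hypothetical file contents: the real rules vs. the decoy everyone else gets.
REAL_ROBOTS = "User-agent: *\nDisallow: /staging/\n"
DECOY_ROBOTS = "User-agent: *\nDisallow: /\n"


def is_verified_bot(ip, reverse_lookup=None, forward_lookup=None):
    """Round-trip DNS check: IP -> hostname -> IPs, which must include the IP."""
    # Lookups are injectable so the logic can be tested without network access.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        # The reverse name must belong to a trusted crawler domain.
        if not host.endswith(TRUSTED_SUFFIXES):
            return False
        # The forward lookup must resolve back to the requesting IP,
        # otherwise anyone controlling their own reverse DNS could spoof it.
        return ip in forward_lookup(host)
    except OSError:
        # Failed lookups (socket.herror, socket.gaierror) count as unverified.
        return False


def robots_txt_for(ip, **lookups):
    """Return the real file for verified crawlers and the decoy for everyone else."""
    return REAL_ROBOTS if is_verified_bot(ip, **lookups) else DECOY_ROBOTS
```

You would call `robots_txt_for(client_ip)` from whatever handler serves `/robots.txt`. DNS lookups are slow, so a real setup would probably cache the verdict per IP.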
Thanks, Bill. That sure confused me once when I saw it on someone's website, snooping as I was. Nice to know the reason! Now, can we game it? ;-)
It's quite interesting how some major websites cloak this while others don't seem to care. Simply allowing the majors and leaving the small engines out seems like a good tactic to me. This is cloaking at its simplest, but it's one of the few times it won't be frowned upon by the search engines ;)
@incredibill If someone links to the robots.txt file from another site, then the robots.txt file will appear in the SERPs, and the Google cache of the file will show you what Google really saw... unless the site owner takes some extra steps (most don't).