Posted By: MattMcGee 162 days ago
Topic Type: News Story (Jump to http://www.jenniferslegg.com)
Category: Other Search Marketing
5 Comments
Comments
lol dude this is crazy man. I have always thought that the robots.txt file should be allowed to live at any location, much like the XML sitemap. Just register it in the webmaster account.
Robots.txt files always seem so revealing. Your client's webmaster should know to use subdomains for testing and never to link from a "live" source to the testing area. Bots mostly know about pages because of links, though some run fancy search queries and so forth.
I've been warning people about the exposure of robots.txt for years purely from a website security point of view and solving this is not rocket science.
The solution is called a DYNAMIC ROBOTS.TXT file: you serve up the REAL robots.txt file only to valid whitelisted bots from major search engines, verified by IP range and/or round-trip DNS validation (reverse lookup on the IP, then forward lookup on the resulting hostname).
Everyone else sees the following:
User-agent: *
Disallow: /
Which will confuse the hell out of them, which is a good thing if they're snooping!
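For anyone wondering what that looks like in practice, here is a minimal sketch of the idea, not Bill's actual implementation. The hostname suffixes in TRUSTED_SUFFIXES, the function names, and the injectable lookup parameters are all assumptions made for illustration; check each search engine's own documentation for the crawler hostnames it really uses.

```python
import socket

# What everyone except verified crawlers gets to see.
DECOY_ROBOTS = "User-agent: *\nDisallow: /\n"

# Assumed whitelist of crawler hostname suffixes (illustrative only).
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def is_verified_bot(ip, reverse_lookup=None, forward_lookup=None):
    """Round-trip DNS check: IP -> hostname -> IPs must include the original IP."""
    # Default to real DNS; the parameters exist so the check can be tested offline.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        # The reverse name must belong to a whitelisted crawler domain.
        if not host.lower().rstrip(".").endswith(TRUSTED_SUFFIXES):
            return False
        # The forward lookup must point back at the same IP, otherwise
        # anyone who controls their own reverse DNS could fake the hostname.
        return ip in forward_lookup(host)
    except OSError:
        # Lookup failures (socket.herror, socket.gaierror) mean "not verified".
        return False


def robots_for(ip, real_robots, **lookups):
    """Return the real robots.txt body for verified crawlers, the decoy otherwise."""
    return real_robots if is_verified_bot(ip, **lookups) else DECOY_ROBOTS
```

You would call robots_for(client_ip, real_text) from whatever handler answers /robots.txt. DNS lookups are slow, so a real deployment would likely cache the verdict per IP.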
Thanks Bill. That sure confused me once when I saw it on someone's website - snooping as I was. Nice to know the reason! Now can we game it ;-)
It's quite interesting how some major websites cloak this, and some don't seem to care. Simply allowing the majors and leaving the small engines out seems to be a good tactic to me. This is cloaking at its simplest, but one of the few times it won't be frowned upon by the search engines ;)
@incredibill If someone links to the robots.txt file from another site, then the robots.txt file will appear in the SERPs, and the Google cache of the file will show you what Google really saw... unless some extra steps are taken by the site owner (most don't).