Posted By: MattMcGee 508 days ago
Topic Type: News Story (Jump to http://www.jenniferslegg.com)
Category: Other Search Marketing
5 Comments
Comments
lol dude this is crazy man. I have always thought that the robots.txt file should be allowed to live in any location, much like the XML sitemap. Just register it in the webmaster account.
They always seem so revealing. Your client's webmaster should know to use subdomains for testing and never to link from a "live" source to the testing area. Bots only know about pages because of the links. Some also run fancy search queries and so forth.
I've been warning people about the exposure of robots.txt for years purely from a website security point of view and solving this is not rocket science.
The solution is called a DYNAMIC ROBOTS.TXT file: you serve up the REAL robots.txt file only to valid whitelisted bots from major search engines, verified by IP range and/or round-trip DNS validation (reverse lookup on the IP, then a forward lookup on the resulting hostname).
Everyone else sees the following:
User-agent: *
Disallow: /
Which will confuse the hell out of them, which is a good thing if they're snooping!
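For anyone wanting to try the dynamic robots.txt idea described above, here is a rough Python sketch of the round-trip DNS check. It is only an illustration under stated assumptions, not the commenter's actual setup: the hostname suffixes, the `REAL_ROBOTS` rules, and the function names `is_verified_bot` and `robots_for` are all made up for the example, and each search engine's own documentation should be checked for its current crawler hostnames.

```python
import socket

# Hostname suffixes treated as trusted crawlers. This list is an assumption
# for the sketch; verify it against each search engine's documentation.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")

# Hypothetical "real" rules, shown only to verified crawlers.
REAL_ROBOTS = "User-agent: *\nDisallow: /private/\n"
# The decoy from the comment above, shown to everyone else.
DECOY_ROBOTS = "User-agent: *\nDisallow: /\n"


def is_verified_bot(ip, reverse=socket.gethostbyaddr,
                    forward=socket.gethostbyname_ex):
    """Round-trip DNS check: IP -> hostname -> IPs must include the IP."""
    try:
        hostname = reverse(ip)[0]          # reverse (PTR) lookup
    except OSError:
        return False
    if not hostname.lower().endswith(TRUSTED_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]   # forward lookup on that hostname
    except OSError:
        return False
    # A forged PTR record fails here, because the attacker does not
    # control the search engine's forward DNS zone.
    return ip in addresses


def robots_for(ip, **resolvers):
    """Return the robots.txt body to serve to the given client IP."""
    return REAL_ROBOTS if is_verified_bot(ip, **resolvers) else DECOY_ROBOTS
```

In a real site `robots_for` would be called from whatever handler serves `/robots.txt`, passing the client's IP address. The `reverse` and `forward` parameters exist only so the check can be exercised without live DNS; caching the lookup results would also be sensible, since DNS round trips on every request are slow.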
Thanks Bill. That sure confused me once when I saw it on someone's website - snooping as I was. Nice to know the reason! Now can we game it ;-)
It's quite interesting how major websites cloak this, and some don't seem to care. A simple "allow the majors, leave the small engines out" seems to be a good tactic to me. This is cloaking at its simplest, but one of the few times it won't be frowned upon by the search engines ;)
@incredibill If someone links to the robots.txt file from another site, then the robots.txt file will appear in the SERPs, and the Google cache of the file will show you what Google really saw... unless some extra steps are taken by the site owner (most don't).