
Forgotten (and, until recently, ignored) non-standard statements in your robots.txt might suddenly change Googlebot’s behavior, without notice.

I don’t know for sure which experimental crawler directives Google has implemented yet, but for example a line like
Noindex: /
in your robots.txt will now deindex your complete Web site.

Better check your robots.txt and make sure it doesn’t contain crawler directives that belong in robots meta tags or X-Robots-Tag headers.
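One way to run that check is a small script that flags every robots.txt line whose field name is not on an allowlist. The sketch below is only an illustration: the function name `find_unexpected_directives` and the contents of `KNOWN_FIELDS` are assumptions made for this example, not something Google has published, and the allowlist should be adjusted to the crawlers you care about.

```python
# Assumption: this allowlist is illustrative, not an official list of
# directives that any particular crawler supports.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def find_unexpected_directives(robots_txt):
    """Return (line_number, field, value) for each line not covered by KNOWN_FIELDS."""
    findings = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # Split on the first colon only, so URLs in values stay intact.
        field, sep, value = line.partition(":")
        if not sep:
            # A non-empty line without a colon is not a directive at all.
            findings.append((number, line, ""))
            continue
        if field.strip().lower() not in KNOWN_FIELDS:
            findings.append((number, field.strip(), value.strip()))
    return findings


sample = """User-agent: *
Disallow: /private/
Noindex: /
"""

for number, field, value in find_unexpected_directives(sample):
    print(f"line {number}: unexpected directive {field!r} with value {value!r}")
```

Run against the sample, this reports the `Noindex: /` line and leaves the `User-agent` and `Disallow` lines alone. It only inspects field names; it does not judge whether the paths themselves make sense.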

Comments

from ronwicker 2586 Days ago #
Votes: 2

You have great, well-thought-out posts, and I’m a fan of the swearing and brutal honesty.

Moderator
from Sebastian 2586 Days ago #
Votes: 0

Thanks for the compliment, Ron.

from g1smd 2585 Days ago #
Votes: 0

Great article. For this entry:
Noindex: /repstuff/noindex.php
Did this URL get indexed: example.com/repstuff/ at all?
I would have picked a non-index-page URL for the test, just in case Google indexed the bare URL without a filename anyway.

Moderator
from Sebastian 2585 Days ago #
Votes: 0

Thanks. :) Google can’t index example.com/repstuff/ because there’s no default document and directory browsing is forbidden.

from g1smd 2585 Days ago #
Votes: 0

OK. I follow that case.

from Kalena 2582 Days ago #
Votes: 0

Hey Seb - great post. Did you see it got blogged by Dave at WPN? Here http://www.webpronews.com/insiderreports/2007/11/21/unvalidated-robots-txt-risks-google-banishment

Moderator
from Sebastian 2582 Days ago #
Votes: 0

Thanks :)

Moderator
from Sebastian 2578 Days ago #
Votes: 0

Here are the first test results: http://sebastians-pamphlets.com/validate-your-robots-txt-or-google-might-deindex-your-site/#robots-txt-test-results-2007-11-28
It seems Google indeed treats Noindex: in robots.txt as Disallow:. If so, that’s a bad move. I hope they’ll do the right thing eventually. Noindex: shouldn’t block crawling, because it implies Follow:.
