
For the short term, it appears that Google has had the last laugh. Read this wacky index switcheroo case study from CapeCodSEO.
6 Comments

from g1smd 2429 Days ago #
Votes: 0

The post you blocked: does it have more than one URL that can access it? Did a different URL for it get indexed? Did you block it in robots.txt a day or two before publishing it? If not, Google can find a page and index it, and only later, perhaps the next day, get the newer version of robots.txt that was supposed to block it, and then take a day or two to drop it from the results. That happens a lot. Check the Google cache date, and then check your site logs for the date and time that Google fetched the robots.txt file.
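As a rough sketch of that log check, the Python below scans access-log lines for robots.txt requests from a Googlebot user agent and returns their timestamps, which can then be compared with the cache date. It assumes the common/combined log format; the function name, the sample lines, and the IP addresses are made up for illustration, and a user-agent string alone does not prove the request really came from Google.

```python
import re
from datetime import datetime

# Assumes the common/combined access log format; adjust for your server.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)


def googlebot_robots_fetches(lines):
    """Return timestamps of /robots.txt requests whose user agent mentions Googlebot."""
    fetches = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent") or ""
        if match.group("path") == "/robots.txt" and "Googlebot" in agent:
            fetches.append(
                datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
            )
    return fetches


# Hypothetical sample data, not taken from any real log.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Mar/2008:06:15:02 +0000] "GET /robots.txt HTTP/1.1" '
    '200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Mar/2008:06:15:09 +0000] "GET /new-post/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Mar/2008:07:00:00 +0000] "GET /robots.txt HTTP/1.1" '
    '200 120 "-" "SomeOtherBot/1.0"',
]

if __name__ == "__main__":
    for when in googlebot_robots_fetches(SAMPLE_LOG):
        print(when.isoformat())
```

If the fetch of the post appears in the log before the first robots.txt fetch that followed your upload, the crawler was still working from the older rules.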

from dedmond29 2429 Days ago #
Votes: 0

g1smd - thanks for taking the time to comment and make suggestions. It’s completely appreciated. I am pretty certain I uploaded the new robots file in coordination with the post - and have since simply removed the block. I also tweaked a couple of other things and we’ll see what happens. Your comments did get me thinking about checking Google’s index of the site in greater detail, and it does appear that I need to spend a bit more time on this. Interestingly enough, more pages have now been removed (since this morning - including the sitemap page), although Google Blog Search has indexed all posts as appropriate. The funny thing is - I don’t even receive a substantial amount of Google traffic (nor have I ever) but can’t really NOT rank this blog for a select group of keywords. Sigh. Thanks again for the time.

from g1smd 2429 Days ago #
Votes: 0

Google doesn’t always fetch the robots.txt file before fetching new content, and even when they do fetch it they can take several days to act on it, and even longer to deindex stuff they have already listed. Nothing is instantaneous, and stuff that you thought secret can be easily leaked by doing things in the wrong order. Different search engines also do things in slightly different ways. There’s no easy answer. :-)

from MattCutts 2429 Days ago #
Votes: 1

g1smd, that’s what I was thinking too. Any time I hear "I blocked Google from one post, and then the root page disappeared" it often turns out to be related. I did a three-second check to verify that the webspam team had nothing to do with it.

from dedmond29 2429 Days ago #
Votes: 0

@MattCutts, g1smd - thanks for the replies and the quick check. I didn’t really think there was/is some form of conspiracy going on here - but it was fun to write about (sorry). I also did not consider the relationship between search engine crawl rates, robots.txt files and timing. That is a really interesting point of reference and something to pay attention to in the future. Thanks again!

from g1smd 2428 Days ago #
Votes: 0

I have learnt to upload a new robots.txt about a week before the URLs that need to be blocked go live.
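Before that lead time starts, it can also help to confirm that the draft rules really cover the URL. The sketch below uses Python's standard `urllib.robotparser` for this; the `is_blocked` helper, the draft rules, and the example.com URLs are hypothetical, and the standard-library parser may not match every detail of how a given search engine interprets robots.txt. It checks the rules only, not when a crawler will pick them up.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, user_agent, url):
    """Return True if the given robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical draft rules and URLs, for illustration only.
DRAFT_ROBOTS = """\
User-agent: *
Disallow: /private-post/
"""

if __name__ == "__main__":
    for url in ("http://example.com/private-post/", "http://example.com/"):
        state = "blocked" if is_blocked(DRAFT_ROBOTS, "Googlebot", url) else "allowed"
        print(url, state)
```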
