Published: Jun 13, 2008 - 06:58 am
Story Found By: tomcritchlow 1441 Days ago
Category: SEO
13 Comments
Comments
Looks like the problem might be even more widespread than just .0 - some .1 URLs not working either... Stay tuned to the comments!
See also: http://sphinn.com/story/52509
Thanks Ian. They're related stories, but this is the one with the Google clarification plus a pretty extensive investigation.
Aren't we all lucky that we can benefit from Rand's direct line to Google :)
@tamar: Sure, just joining the two, so that it still makes sense a year from now...
I find it curious that .0 URLs get penalized while URLs ending in other numbers get indexed. Like tomcritchlow, I also wonder where the indication of spam came from.
This is a strange one, but I am sure that Google will address it now that it has been noticed.
Seems .0 pages are still indexed with a trailing slash at the end:
http://www.unicode.org/versions/Unicode4.0.0/
http://creativecommons.org/licenses/by-sa/3.0/
This conventional wisdom is already outdated after just a few hours. I just did a post to help people understand this issue: http://www.mattcutts.com/blog/dont-end-your-urls-with-exe/
It appears that URLs ending with .zip aren't indexed either.
Glad this situation got cleared up. The only thing I'd like to ask is that we not call it a "penalty" as it is referred to in the SEOmoz post. There's way too much confusion and obfuscation in that arena as it is. "Indexing behaviour" might be a better term.
@jimbeetle you beat me to it! I was just coming here to say the same thing. (Have been behind on my reading the past few weeks.) It does a disservice to everyone to call this a penalty rather than what it actually is -- a filter. Other than that, great find and thanks to SEOmoz for sharing.
"It does a disservice to everyone to call this a penalty rather than what it actually is -- a filter." Jill, I wouldn't even call it a filter, since I think Google does this at the crawl level. When I coded PageRankBot I had to check file extensions (e.g. WMV, jpeg, and a slew of others) to prevent the scraper bot from retrieving files containing large amounts of data (for example, WMV files might run up to 300MB or more). By preventing the tool from crawling those files I kept it from "stalling." It's basically just a method of crawler resource optimization, nothing more.
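For anyone wondering what that kind of crawl-level extension check might look like, here is a rough Python sketch. It is not PageRankBot's code and certainly not Google's: the `SKIP_EXTENSIONS` set and the `should_crawl` function are hypothetical names, and the extension list just collects the ones mentioned in this thread for illustration.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical skip list, built only from extensions named in this thread.
# A real crawler's list is an assumption here and would likely differ.
SKIP_EXTENSIONS = {".wmv", ".jpeg", ".zip", ".exe", ".0"}


def should_crawl(url: str) -> bool:
    """Return False when the URL path ends in an extension on the skip list."""
    path = urlparse(url).path
    # splitext only looks at the last path segment, so a trailing slash
    # leaves an empty segment and therefore no extension.
    extension = posixpath.splitext(path)[1].lower()
    return extension not in SKIP_EXTENSIONS


if __name__ == "__main__":
    for candidate in (
        "http://creativecommons.org/licenses/by-sa/3.0",
        "http://creativecommons.org/licenses/by-sa/3.0/",
        "http://example.com/clip.WMV",
    ):
        print(candidate, should_crawl(candidate))
```

Under these assumptions, a check like this would also be consistent with the trailing-slash observation above: `/3.0` has the extension `.0` and is skipped, while `/3.0/` has no extension in its last segment and gets crawled.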