Posted By: tomcritchlow 121 days ago
Topic Type: News Story (Jump to http://www.seomoz.org)
Category: Google SEO
13 Comments
Comments
Looks like the problem might be even more widespread than just .0 - some .1 URLs not working either... Stay tuned to the comments!
See also: http://sphinn.com/story/52509
Thanks Ian. They're related stories but this is the one with the Google clarification plus a pretty extensive investigation.
Aren't we all lucky that we can benefit from Rand's direct line to Google :)
@tamar: Sure, just joining the two, so that it still makes sense a year from now...
I find it curious that .0 URLs get penalized while URLs ending in other numbers get indexed. Like tomcritchlow, I also wonder where the indication of spam came from.
This is a strange one, but I am sure that Google will address it now that it has been noticed.
It seems .0 pages are still indexed when the URL has a trailing slash at the end:
http://www.unicode.org/versions/Unicode4.0.0/
http://creativecommons.org/licenses/by-sa/3.0/
This conventional wisdom is already outdated after just a few hours. I just did a post to help people understand this issue: http://www.mattcutts.com/blog/dont-end-your-urls-with-exe/
It appears that URLs ending with .zip aren't indexed either.
Glad this situation got cleared up. The only thing I'd like to ask is that we not call it a "penalty" as it is referred to in the SEOmoz post. There's way too much confusion and obfuscation in that arena as it is. "Indexing behaviour" might be a better term.
@jimbeetle you beat me to it! I was just coming here to say the same thing. (Have been behind on my reading the past few weeks.)
It does a disservice to everyone to call this a penalty rather than what it actually is -- a filter.
Other than that, great find and thanks to SEOmoz for sharing.
"It does a disservice to everyone to call this a penalty rather than what it actually is -- a filter."
Jill, I wouldn't even call it a filter, since I think Google does this at the crawl level. When I coded PageRankBot I had to check file extensions (e.g. WMV, jpeg, and a slew of others) to prevent the scraper bot from retrieving files containing large amounts of data (for example, WMV files might run up to 300MB or more). By preventing the tool from crawling those files I kept it from "stalling." It's basically just a method of crawler resource optimization, nothing more.
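For anyone curious what that kind of crawl-level extension check could look like, here is a minimal sketch in Python. It is not PageRankBot's code and says nothing about how Google actually does it; the `SKIPPED_EXTENSIONS` set and the `should_skip` function are hypothetical names made up for illustration, and the extension list is only an assumed sample.

```python
from urllib.parse import urlsplit
import posixpath

# Hypothetical sample of extensions a crawler might refuse to fetch
# because they usually point at large binary files.
SKIPPED_EXTENSIONS = {".wmv", ".jpeg", ".zip", ".exe"}


def should_skip(url: str) -> bool:
    """Return True if the URL path ends in one of the skipped extensions."""
    path = urlsplit(url).path             # drop scheme, host, query, fragment
    _, ext = posixpath.splitext(path)     # extension of the last path segment
    return ext.lower() in SKIPPED_EXTENSIONS


if __name__ == "__main__":
    for url in (
        "http://example.com/media/clip.WMV",
        "http://example.com/downloads/archive.zip",
        "http://www.unicode.org/versions/Unicode4.0.0/",
    ):
        print(url, "skip" if should_skip(url) else "crawl")
```

Note that a path ending in a slash has an empty last segment, so a check like this finds no extension on it. That would at least be consistent with the trailing-slash observation made earlier in the thread, though that is only a guess about the cause.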