Published: Sep 06, 2008 - 07:25 am
Story Found By: stephan 1358 Days ago
Category: SEO
11 Comments
Comments
Had this problem previously and, interestingly, once got in touch with you, Stephan, to ask what you thought! After nofollowing, our traffic went down, but we kept the nofollows in place anyway, as our servers had been totally hammered by all the attribute-page crawling. Interestingly, though traffic fell, revenue did not and, as a result, the conversion rate went up.
Great article. This issue seems to be related to Google's pronouncements a while back (in late '06? early '07?) that it didn't want to continue to index site search results in general, and that sites allowing their site search results pages to be spidered might see some sort of bad rankings outcome; whether that would amount to a rankings hit to the non-search pages or just a hit to the search pages, I don't recall. My understanding, however, is that nothing ever came of that, i.e., we *didn't* see huge numbers of webmasters complaining that their sites had been penalized due to their indexed search results. (If I'm wrong about that, someone here will set me straight!) So how's this scenario? A client has a B2C site as well as a third-party site search solution with attribute-based navigation built in. The B2C site's web host automatically provides a sitemap.xml file just for the site pages, so Google's getting ongoing sitemap data just for the "real" pages, not for the site search pages. Absent implementing a robots.txt or nofollow attribute scheme for the site search pages, could the sitemap-only-for-site-pages implementation help by making sure the "money" site pages with order buttons are indexed, leaving the indexing of the site search pages more at the whim of Google?
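A minimal sketch of the sitemap setup described in this comment, assuming the site search pages all live under a `/search` path (the prefix, the function names, and the example.com URLs are hypothetical, not taken from the client's site). It only shows how the "real" pages could be listed while search pages are left out; a sitemap is a hint about which URLs exist, so leaving a URL out does not by itself stop it from being crawled or indexed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical path prefix for the third-party site search pages.
SEARCH_PREFIX = "/search"


def is_search_page(url: str) -> bool:
    """Return True if the URL path falls under the site search prefix."""
    return urlparse(url).path.startswith(SEARCH_PREFIX)


def build_sitemap(urls) -> str:
    """Build sitemap XML that lists only the non-search pages."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if is_search_page(url):
            continue  # leave site search pages out of the sitemap
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    "https://example.com/products/1",
    "https://example.com/search?color=red",
])
print(sitemap_xml)
```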
I agree with Google's mantra, "build pages for users, not for search engines". I find it annoying when I'm searching and keep getting old site search pages appearing in SERPs, meaning I never actually find the article I'm looking for. If it annoys me, it's going to annoy customers. I took the step of applying meta noindex, follow tags on my site search result pages, reminding myself that I'm helping the user experience by telling search engines to only include useful pages. Interestingly, traffic really wasn't affected by this. However, the real improvement was in conversion. Give people what they are looking for, and they'll take it. By default now, I make my site search result pages noindex, follow.
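As a rough illustration of the "noindex, follow by default" approach in this comment, the sketch below picks the robots meta tag for a page based on its URL. The `/search` prefix and the `robots_meta` helper are assumptions made for the example; a real site would hook this into whatever template renders the page `<head>`.

```python
from urllib.parse import urlparse

# Hypothetical path prefix for site search result pages.
SEARCH_PREFIX = "/search"


def robots_meta(url: str) -> str:
    """Return the robots meta tag to place in the page <head>."""
    if urlparse(url).path.startswith(SEARCH_PREFIX):
        # Keep the page out of the index but let crawlers follow its links.
        content = "noindex, follow"
    else:
        content = "index, follow"
    return f'<meta name="robots" content="{content}">'


print(robots_meta("https://example.com/search?q=wallets"))
print(robots_meta("https://example.com/wallets/"))
```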
I agree that if the results page is 100% identical, it can be problematic, but what if your results pages are a more unique cross-section of the same catalog? Why wouldn't I want my customers to go directly to a very specific attribute page if it's an often sought-after attribute? I don't think there is any 100% answer to this problem, but moderation in which attribute searches get indexed would probably be advised.
"A combination of nofollows, meta noindexing and disallows strategies should be employed for this." You don't want to combine disallow and meta noindex. If you disallow a URL, the meta tag doesn't get read.
What about using these duplicate pages to support the main categories? By programmatically altering the title tags, descriptions and perhaps even some of the text, these pages can look quite different. They can in turn gain some ranking weight and support the most important pages. The idea is not to battle this duplicate content problem, but to use it.
@IncrediBILL, That is a good point, which is why I always recommend starting with nofollows for the low-value (from an SEO perspective) attributes first, like the price range breakdowns, since few folks are searching for "car stereos $0 to $50". Then you can experiment with more aggressive nofollowing. @Halfdeck, Thanks for mentioning that. We prefer meta robots noindex over robots.txt disallows because the robots.txt disallow doesn't completely remove the listing from the SERPs; it just leaves a snippetless, titleless listing in the SERPs that can still rank for keywords in anchor text pointing to that disallowed page. And in order for the meta tag to be read and obeyed by Google, the page must be allowed by robots.txt. @monchito, That's the crux of it, that the pages need to "look quite different". If the page content isn't significantly paraphrased/rewritten, then there will be too many shingles in common with the original page and it gets picked up as duplicate content.
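The disallow-versus-noindex conflict raised in this exchange can be checked mechanically. The sketch below uses Python's standard `urllib.robotparser` to flag URLs that are meant to carry a meta noindex tag but are blocked by robots.txt, so a crawler would never fetch the page and see the tag. The robots.txt rules, the example.com URLs, and the `blocked_noindex_urls` helper are made up for illustration, and Python's parser does plain prefix matching, so it may not mirror exactly how Googlebot interprets every rule.

```python
from urllib.robotparser import RobotFileParser


def blocked_noindex_urls(robots_txt: str, noindex_urls, user_agent: str = "Googlebot"):
    """Return the noindex URLs that robots.txt stops the crawler from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that disallows the site search pages.
robots_txt = """\
User-agent: *
Disallow: /search
"""

conflicts = blocked_noindex_urls(
    robots_txt,
    ["https://example.com/search?color=red", "https://example.com/wallets/"],
)
print(conflicts)
```

Any URL this returns is one where the disallow rule would need to be lifted before the meta noindex tag could take effect.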
I wonder what a slightly altered page (e.g. different titles, descriptions and short texts, maybe some extra links to the most important page, and a different HTML template) in combination with meta noindex, follow does in terms of passing on internal PageRank?
Just checking back re my previous question. So nobody has any insight into whether putting only the "real pages" in a sitemap.xml file would help "weight" them as more relevant vs. the search result pages in Google's eyes?
@winooski - Google can't tell the difference between one type of page and the other; a page is a page. However, if both types of pages return the exact same results, then it might notice. Other than that, and speaking as someone with this exact situation, as long as the results are unique, either type of page should rank just fine.
Still makes sense to keep the different URLs, though, e.g. "Large Luxury wallets for men" and "Large Luxury wallets for women". I wouldn't want either group to have to sift through all the wallets. Maybe there should be a balance, or we need to make sure the faceted pages actually are optimized (i.e., obviously unique and persuasive to people).
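One rough way to put a number on how "unique" two faceted pages are is the word-shingle comparison mentioned earlier in the thread. The sketch below is a generic w-shingling and Jaccard similarity calculation, not a description of how Google actually detects duplicates; the shingle size of 4 and the function names are arbitrary choices for the example.

```python
def shingles(text: str, size: int = 4) -> set:
    """Return the set of word-level shingles (overlapping runs of `size` words)."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a: str, text_b: str, size: int = 4) -> float:
    """Shingle-based similarity between two texts, from 0.0 to 1.0."""
    return jaccard(shingles(text_a, size), shingles(text_b, size))


score = similarity(
    "Large Luxury wallets for men",
    "Large Luxury wallets for women",
)
print(round(score, 3))
```

A score near 1.0 means the two pages share almost all of their shingles; pages that differ only in a title word or two will score high once the shared product listings are included in the text being compared.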