Outing a few academic sites. It's quite annoying to click a promising PDF link on a SERP when the result is a signup page asking for pounds or dollars. Technically speaking, there are better ways to serve different PDFs to lurkers/crawlers and paying members.
12 Comments




Comments
The publishers are part of the Google Scholar program. And yeah, it's cloaking. But it's not new -- nor is it hidden from Google. It's done with Google's full cooperation. I wrote about this back in 2004: http://blog.searchenginewatch.com/blog/041201-063855 As I've covered before in the situations of cloaking allowed at Google for NPR and Google Scholar, this type of cloaking is helpful to searchers. It's good cloaking. I have to stress, Dodds and the other Google Scholar participants are doing nothing wrong. They are working directly with Google, with Google's full approval, in a way that Google rightly feels will help searchers. Nevertheless, Google's failure to update its policy continues to make it sound hypocritical. Telling general web publishers not to cloak, then having your Google Scholar participants talk about "sleight-of-hand" is a mixed message.
At least they've updated the policy regarding subscriber-only content in Google News (http://www.google.com/support/news_pub/bin/answer.py?answer=40543&topic=11707)
A few things: 1. If they are part of the G Scholar program, they can stay there. Keep them out of the main index, where they cause searchers more annoyance than help. 2. Google should label that the content may not be freely accessible to everyone. Pubmed (a National Institutes of Health-backed service) has two icons to symbolize freely accessible papers and pay-for papers. It works really well. 3. A lot of publisher sites (Nature, Blackwell Synergy, the American Microbiology Association, American Chemical Society, PNAS, Science Mag) rank really well using just the abstracts and, additionally, any PDF contents are freely accessible to anyone. If Google says it's found a PDF, I can get it. THAT is useful to searchers. There are two other angles to this: 1. These publishers gave away their content to Google for free but world+dog have to pay. It's their content (they can do as they please), but Google should not index it. 2. There are now two tiers of publishers in the eyes of Google, and as a scientist, let me tell you that the ones that are not indexed because they don't cloak (Science, Nature) are much larger, have higher impact, and are more prestigious than the cloakers. I won't go into conspiracy theories about why this is an interesting observation. So it's not just about updating the policy. The whole setup stinks and is wrong in the eyes of the users. Pierre (of blogSci.com)
Thanks for the correction, Danny :) Annoying searchers with Google's permission doesn't make it less annoying in the main index.
I agree with Pierre and am for a level playing field, at least for the main web search index. Google News Archives tag paid content; I like that, you know what you are going to get, no false expectations. I don't know about Google Scholar, but I hope they tag paid content as well. This kind of cloaking just doesn't belong in the main web search index, in my opinion. If they want, they could just have their abstract pages indexed like everyone else with paid content. If WMW were to get into Google Scholar, would they also be allowed to cloak content to the engines and redirect users to signup pages?
WMW, as a news site, might already be able to do that. Pierre, I totally agree with the points you make. I voiced many of them myself back when they started this integration. What's likely happening is that they are increasing the content in the system, which as you say, is annoying. Of course, if you are on a university campus, they do some other things to make the scholarly material show without a PDF, I believe. This had some info on it: http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16204 At this point, they probably should get Google Scholar out of mixing with web results and make it more like what Yahoo does with subscription content: http://www.ysearchblog.com/archives/000117.html
Academic institutions tend to subscribe to these publishers' sites. The authentication is automatic via IP address block checking. The institutions of course pay, and so when their users click through to a PDF from the SERPs, they get the PDF in question. The point here is to reinforce that, to a small demographic on the web, a straight clickthrough is useful. (Subscribers to science journals are a tiny population compared to the web's population.) That's why I say contain this content in G Scholar and label it properly. I would also open it up to all other publishing houses - Google really ought to contact the main ones (they're only a handful) and tell them about this setup. The content will always increase, as the academic publishing machine is stronger than ever. BTW, John, the original blogger I linked to just posted a comment. He makes some good points from the point of view of an academic. Pierre
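For readers unfamiliar with IP address block checking, here is a minimal sketch of the idea in Python. It is only an illustration, not any publisher's actual code: the address ranges are reserved documentation ranges standing in for subscribing institutions, and the names `SUBSCRIBER_NETWORKS`, `is_subscriber`, and `choose_response` are made up for this example.

```python
import ipaddress

# Hypothetical subscriber address blocks. These are reserved documentation
# ranges, used here as stand-ins for real institutional networks.
SUBSCRIBER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_subscriber(client_ip: str) -> bool:
    """Return True if client_ip falls inside any subscribed address block."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in SUBSCRIBER_NETWORKS)


def choose_response(client_ip: str) -> str:
    """Pick what to serve: the full PDF for subscribers, the signup page otherwise."""
    return "full_pdf" if is_subscriber(client_ip) else "signup_page"
```

A visitor from inside a subscribed block would get the PDF with no login step, while everyone else lands on the signup page, which is the behaviour described in the comment above.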
Regardless of the kind of publisher and the current rules: Should subscription / paid content be included in the normal web search results (not just the abstract pages, but the full content crawled and indexed, but not viewable for users)? What if it were tagged and usable as a filter like in Google News Archives?
By the way, this: YADAC: Yet Another Debate About Cloaking Happens Again http://searchengineland.com/070304-231603.php is from me in March and recaps the entire history of cloaking debates, especially as we've had things like Google Scholar or NPR and others get to do "special" things with Google.
YADACHA
YADAC is right, but why do we have to debate it YA? Why does it take that much debating before it gets taken seriously by the webmasters / site owners and the search engines? What's wrong with only getting the sell-page with the abstract indexed? It could be a matter of a spam report, getting that handled and done. (heh, I wonder if anyone did that :-))
Hey all, I wrote up a summary of the discussions from the past few days on my blog: http://blogsci.com/randoms/summary-of-academic-publishers-cloaking-discussion I think it's the correct assessment. Comments welcome of course :) Pierre