A new patent application from Google on near-duplicate content explores using a combination of document similarity techniques to keep redundant content out of search results.

Comments

from billslawski 195 days ago #
Votes: 2

Thank you for the sphinn, Kimmie. I liked this patent application from Google a lot because it cites so many other documents about duplicate and near-duplicate content, and provides some good research about those other approaches.

from iamlost 194 days ago #
Votes: 0

This issue - and the SEs' solutions - will become increasingly critical. As duplicate filtering succeeds, and only one instance of a piece of content is ever returned, determining and crediting the original takes on greater importance. Unfortunately, the SEs do not give much weight to actual authorship unless hit over the head with a DMCA cudgel.

My auto-response 'Thank You' to Bill for continuing to bring us the future of search as described by patents - in plain language. As always (a couple of years == always in web time, right?) seobythesea sits atop my reading list.

from billslawski 194 days ago #
Votes: 0

You're welcome, iamlost.

Like many of the papers/patents about duplicates and near duplicates, this patent application doesn't talk about how a search engine might distinguish between an original author and someone copying content.

The only patent that I can recall from Google that did address that area was their patent on Agent Rank, which relied upon a digital signature, like that used in OpenID, to sign blog posts, blog comments, and other content.

