
A Google patent granted this week explores how a search engine might look at queries that contain stopwords or stop-phrases, and determine whether or not the stopword or stop-phrase is meaningful enough to include in search results shown to a searcher.
8 Comments

from billslawski 2233 Days ago #
Votes: 3

Thanks for sphinning this, David. The thing I found most interesting about this patent was its description of different ways to compare search results from alternative versions of the queries, with and without stopwords, to see if those results were substantially similar. We know that Google stopped telling us about stopwords that it found in queries many months ago, yet it’s possible that a similarity analysis like the one described in the patent may still be used in some instances.
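As a rough illustration of the "substantially similar" comparison mentioned above, here is a minimal sketch. It is not the patent's method: the stopword list, the Jaccard overlap measure, the top-10 cutoff, the 0.8 threshold, and the `fake_search` stand-in for a search backend are all assumptions made for the example.

```python
# Hypothetical stopword list; a real engine's list is not given in the thread.
STOPWORDS = {"the", "in", "at", "of", "a"}

def strip_stopwords(query):
    """Return the query with assumed stopwords removed."""
    return " ".join(t for t in query.split() if t.lower() not in STOPWORDS)

def jaccard(a, b):
    """Overlap of two result lists as |intersection| / |union| (an assumed measure)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def stopwords_matter(query, search, k=10, threshold=0.8):
    """Guess whether the stopwords in a query are meaningful.

    `search` is any callable mapping a query string to a ranked list of URLs.
    The stopwords are treated as meaningful when the top-k results with and
    without them are NOT substantially similar.
    """
    stripped = strip_stopwords(query)
    if not stripped:
        return True   # query is only stopwords, so nothing is left to search on
    if stripped == query:
        return False  # no stopwords present, nothing to decide
    with_sw = search(query)[:k]
    without_sw = search(stripped)[:k]
    return jaccard(with_sw, without_sw) < threshold

# Toy data standing in for a search backend (entirely made up).
FAKE_INDEX = {
    "the onion": ["theonion.example", "satire.example", "news.example"],
    "onion": ["recipes.example", "garden.example", "veg.example"],
    "hotels in paris": ["h1.example", "h2.example", "h3.example"],
    "hotels paris": ["h1.example", "h2.example", "h3.example"],
}

def fake_search(query):
    return FAKE_INDEX.get(query, [])

print(stopwords_matter("the onion", fake_search))
print(stopwords_matter("hotels in paris", fake_search))
```

With this toy data, dropping "the" from "the onion" changes the results completely, while dropping "in" from "hotels in paris" changes nothing, which is the kind of distinction the comparison is meant to capture.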

from NickWilsdon 2233 Days ago #
Votes: 0

Thanks for the patent update Bill, much appreciated.

from JohnHGohde 2233 Days ago #
Votes: -2

How does Google’s ignoring STOP words in their searches reconcile with Google giving ’the’ and ’in’ authority site status?

from ColinCochrane 2233 Days ago #
Votes: 0

Good find.  This is where the really fascinating part of search is, because one of the bigger hurdles to providing relevant results is dealing with the ambiguities of human language.  Stuff like this gets right at the core of one of the biggest problems of computing: inferring semantics from syntax.

from billslawski 2233 Days ago #
Votes: 0

Hi John,

I did try to answer your question on my blog since you posted it there too, and I saw it there first, but I don’t know if my answer might be what you’re looking for.

Does your question have to do with the fact that searches for those words by themselves (the,in) show results that include sitelinks for the first results? It’s odd that they do. Does the presence of site links make those sites authoritative for queries for those terms? A search for "at" also gives us site links for AT&T.

It’s an interesting question, worth exploring further.

Does it make any difference that we are only searching for a single word, regardless of whether it might have been considered a stop word in the past?

from billslawski 2233 Days ago #
Votes: 0

Thanks, Nick and Colin.

I agree with you about this kind of stuff being fascinating, Colin. How do you infer context and intent from a couple of words?

from JohnHGohde 2233 Days ago #
Votes: -2

@billslawski

Hi William,

"Does your question have to do with the fact that searches for those words by themselves (the,in) show results that include sitelinks for the first results?"

Yes. The #1 hit when searching on ’the’ turns up an authority site where the word actually means ’the.’ Whereas in searching for ’in’, what turns up as #1 is the abbreviation for a state in the United States.

It is SEOs, rather than Google, who talk about authority site status. Google just refers to sitelinks, on which they have a patent.

Authority site status, of which there are several versions, is basically a special form of a double or indented listing. All of these are desirable to have in the SERPs, since it is widely believed that they attract the attention of searchers.

Yes, the fact that I was searching for only one word might make a difference. But since sitelinks, and therefore authority sites, are dynamically generated, if ’the’ is a STOP word then why is Google dynamically generating sitelinks for it?

from billslawski 2232 Days ago #
Votes: 0

Hi John,

I tend to shy away from overusing a term to describe things if it might be confusing or misleading. A lot of people have used the term "authority" to describe web pages in many ways. What I think about when I hear the phrase is Jon Kleinberg’s Authoritative pages - http://www.cs.cornell.edu/home/kleinber/auth.pdf

Using the phrase "authority" to describe a site that shows up in search results with site links in response to a specific query may cloud things, as does using the term to describe a site that has an additional indented result in response to a query.

While I’m not sure that Google would have gone through the whole "substantially similar" comparison analysis that the stopwords patent describes on a search for a single stopword and no other terms, I’m also not sure that a search for "the" or "in" providing us with a top result with site links tells us anything about the stopwords process described in the patent, or the newer process with an enhanced compression/decompression approach that makes phrase searching more viable.

I think looking closer at the site links process might yield more reasons for those terms to show specific pages with site links than considering how Google might be treating stop words or phrase matches. If the idea behind site links is to provide a better user experience in navigational queries by providing easier access to deeper pages in the first result, why would Google’s algorithm decide to show site links for "The Onion" web site on a search for "the"? Since it’s the generation of site links that you’re concerned about, in response to those query terms, that may be the first level of inquiry: why does that process get triggered for those words?
