Story Found By: tappingcreativity 1533 Days ago
Category: SEO
5 Comments
Comments
If you don't know what LSI really is but don't mind pretending you do, then this is the kind of article you might write. Statements such as "Google, in fact, implemented LSI into its algorithm a few years ago and has continued to use it since", "The best way to discover these semantic relationships is to perform a search of Google with the tilde (~) character in front of your query" and "This is especially true because Google uses LSI to evaluate the relevancy of your website's link profile" are just plain wrong. LSI is a complex mathematical operation and I agree it is difficult to understand, but let's not fuel the myth that search engines use LSI. To do so simply encourages the snake oil salesmen to peddle their "LSI software" and other "LSI optimized" products and services to unsuspecting punters. I have tried to redress the balance with a layman's explanation of LSI http://www.seo-blog.com/latent-semantic-indexing-lsi-explained.php and a follow-up post on the LSI Myth http://www.seo-blog.com/latent-semantic-index-lsi-myth.php
Thanks for the follow up, duz.
You might even leave a comment like this if you're a pompous and arrogant SEO clown who thinks he knows far more than he does but is generally confused as to the reality of the situation. If you had done even five minutes of research before polluting the forum with your "snake-oil" diatribes, you'd have discovered that Google does use a derivative of an LSI system in its algorithm that considers the relationships between words within the content of the site itself (see http://www.seobook.com/archives/000657.shtml). Clearly "duz" or "dunce" (not sure which he meant) would also have found the following, which SEO Book confirms (a far more reputable source than mr. duz, I might add):
- search engines such as Google do try to figure out phrase relationships when processing queries, improving the rankings of pages with related phrases even if those pages are not focused on the target terms
- pages that are too focused on one phrase tend to rank worse than one would expect (sometimes even being filtered out for what some SEOs call being over-optimized)
- pages that are focused on a wider net of related keywords tend to have more stable rankings for the core keyword and rank for a wider net of keywords
Dr. Ralph Wilson (http://www.wilsonweb.com/seo/google-lsi.htm) agrees; see his post for more information. Or you can take Google's word for it by reading the patent they applied for (http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=1&p=1&f=G&l=50&d=PG01&S1=20060018551.PGNR.&OS=dn/20060018551&RS=DN/20060018551), which mr. duz obviously didn't (although he did claim to). It's quite long, but perhaps the most important portion is the following:
"The system is further adapted to identify phrases that are related to each other, based on a phrase's ability to predict the presence of other phrases in a document. More specifically, a prediction measure is used that relates the actual co-occurrence rate of two phrases to an expected co-occurrence rate of the two phrases. Information gain, as the ratio of actual co-occurrence rate to expected co-occurrence rate, is one such prediction measure. Two phrases are related where the prediction measure exceeds a predetermined threshold. In that case, the second phrase has significant information gain with respect to the first phrase. Semantically, related phrases will be those that are commonly used to discuss or describe a given topic or concept, such as 'President of the United States' and 'White House'. For a given phrase, the related phrases can be ordered according to their relevance or significance based on their respective prediction measures."
Hopefully this will clarify the issue for those of us with open minds, and maybe next time mr. duz will do a little research before badmouthing others and making wisecrack comments that only make him look more foolish when corrected.
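For readers who want to see what the prediction measure in the quoted patent excerpt amounts to, here is a minimal Python sketch. It is an illustration only, not Google's implementation. The function name `related_phrases`, the sample documents, and the threshold of 1.5 are all hypothetical, and the expected co-occurrence rate is assumed to be the product of the two phrases' individual document rates (an independence assumption that the excerpt itself does not spell out).

```python
from collections import Counter
from itertools import combinations


def related_phrases(docs, phrases, threshold=1.5):
    """Return phrase pairs whose information gain exceeds the threshold.

    Information gain here is actual co-occurrence rate divided by
    expected co-occurrence rate. ASSUMPTION: the expected rate is the
    product of each phrase's individual document rate.
    """
    n = len(docs)
    # Set of candidate phrases found in each document (simple substring match).
    present = [{p for p in phrases if p in doc.lower()} for doc in docs]

    # Number of documents containing each phrase, and each pair of phrases.
    single = Counter(p for found in present for p in found)
    pair = Counter(
        frozenset(c) for found in present for c in combinations(sorted(found), 2)
    )

    gains = {}
    for key, count in pair.items():
        a, b = sorted(key)
        actual = count / n
        expected = (single[a] / n) * (single[b] / n)
        gain = actual / expected
        if gain > threshold:  # hypothetical "predetermined threshold"
            gains[(a, b)] = gain
    return gains


docs = [
    "the president of the united states spoke at the white house",
    "white house staff briefed the president of the united states",
    "the football team won the match",
    "a football match was played",
]
phrases = ["president of the united states", "white house", "football"]
result = related_phrases(docs, phrases)
```

In this toy corpus the two political phrases always appear together, so their information gain is well above the threshold, while "football" never co-occurs with either and is not reported as related. Note that this is a plain co-occurrence statistic and involves no matrix decomposition.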
Just four simple points, nyorchak.
1. You said in your original article "Google, in fact, implemented LSI into its algorithm a few years ago and has continued to use it since". Now you are saying "Google does use a derivative of an LSI system in its algorithm". Sounds like you have changed your mind.
2. You cite Aaron Wall's article as a reputable source of evidence: http://www.seobook.com/archives/000657.shtml. This article's introduction currently includes: "Some of those well in the know attribute this to latent semantic indexing. Even if they are not using LSI, Google has likely been using other word relationship technologies for a while, but recently increased its weighting". When first written this used to read (courtesy of the Internet Archive) "Some of those well in the know attribute this to latent semantic indexing, which Google has been using for a while, but recently increased its weighting". It seems to me as if Aaron has had second thoughts like you; however, reliable sources are not normally edited without comment.
3. The Google patent you quote (and the other related patents) do not use the term LSI. If you cannot understand the difference between the Google patents and LSI then I cannot help you.
4. Invoking my surname in a derogatory way as you have done signifies a middle school mentality and hence I have no intention of communicating with you further.
Okay Duz, I appreciate your four "simple" points, however I must ask you the following: if you do not consider Google's and other search engines' word relationship technologies a derivative of LSI, what exactly would you call them? It seems that all the evidence says they are the same in almost every way, from form to function. Just because the moniker LSI is not used does not mean the system is not doing exactly what it states in the patent. Google has never been one to divulge its algorithmic personality or drop names, so to speak, so let's not get hung up on the wording. Instead, let's discuss whether the strategies I've discussed in my article are advantageous to SEOs, considering the evidence indicating the presence of word relationship systems and indexing tools within the major engines' algorithms, whatever name they may have. Just curious, do you not consider SEO Book a reliable source of information? Last time I checked I had heard of Aaron, but I hadn't heard of you. I just can't help wondering what the Google patent is referring to if it's not semantically indexing the content of pages, which it clearly states it is. So if not LSI, what would you call a system like this? Either way, the content of my article is relevant and accurate considering the evidence I've seen (you have yet to provide anything to the contrary, with the exception of your own opinions), so like I said earlier, let's debate not whether the term LSI appears but whether strategies based around strengthening word relationships within page content are beneficial for SEO.