Christmas is around the corner and I had only one wish from Google. Why allowing spiders to pass referral information would be beneficial to webmasters...
6 Comments
Comments
Search engines do not come to your site directly from other sites. The bot has a list of URLs to visit, and visits each one in turn, recording the date and time of that visit in a log. For each URL visited, the HTTP status code is stored, and if the URL returned "200 OK" the content of that URL is stored as well. For most other status codes, only the HTTP status code is stored. For redirects, the target URL is also noted. For each page visited, any new URLs found in links in the page's content are added to the list of URLs to fetch. Those URLs will be fetched later, perhaps hours or days later. Because a fetch is never made as a direct result of following a link from another page, there is no such thing as a referrer as far as bots are concerned.
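The crawl loop described in the comment above can be sketched in Python. This is only a minimal illustration under stated assumptions, not how Googlebot or any real crawler is implemented: the names `FetchResult`, `CrawlRecord`, `crawl`, `FAKE_WEB` and `fake_fetch` are hypothetical, the "web" is an in-memory dictionary rather than real HTTP, and queuing the redirect target for a later fetch is an assumption (the comment only says the target is noted).

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timezone
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


@dataclass
class FetchResult:
    """What a (hypothetical) fetcher returns for one URL."""
    status: int
    body: str = ""
    location: Optional[str] = None  # redirect target, if any


@dataclass
class CrawlRecord:
    """One log entry: when a URL was visited and what came back."""
    url: str
    fetched_at: datetime
    status: int
    content: Optional[str] = None
    redirect_target: Optional[str] = None


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100):
    frontier = deque(seeds)   # the list of URLs still to visit
    seen = set(seeds)         # URLs already queued, to avoid duplicates
    log = []
    while frontier and len(log) < max_pages:
        url = frontier.popleft()
        # The fetcher receives only the URL. The page on which the link
        # was found is not tracked, so there is nothing to send as a referrer.
        result = fetch(url)
        record = CrawlRecord(url, datetime.now(timezone.utc), result.status)
        discovered = []
        if result.status == 200:
            record.content = result.body  # content is stored only for 200 OK
            parser = LinkExtractor()
            parser.feed(result.body)
            discovered = [urljoin(url, href) for href in parser.links]
        elif result.status in (301, 302, 303, 307, 308) and result.location:
            record.redirect_target = urljoin(url, result.location)
            discovered = [record.redirect_target]  # assumption: fetch it later
        log.append(record)
        for link in discovered:
            if link not in seen:
                seen.add(link)
                frontier.append(link)  # fetched later, not immediately
    return log


# A tiny in-memory "web" standing in for real HTTP requests (hypothetical data).
FAKE_WEB = {
    "http://example.com/": FetchResult(200, '<a href="/a">A</a> <a href="/old">Old</a>'),
    "http://example.com/a": FetchResult(200, '<a href="/">Home</a>'),
    "http://example.com/old": FetchResult(301, location="/new"),
    "http://example.com/new": FetchResult(404),
}


def fake_fetch(url):
    return FAKE_WEB.get(url, FetchResult(404))


if __name__ == "__main__":
    for rec in crawl(["http://example.com/"], fake_fetch):
        print(rec.status, rec.url, rec.redirect_target or "")
```

Note that `fetch` is called with the URL alone. By the time a queued URL is fetched, the loop has no record of which page linked to it, which is the point the comment makes about referrers.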
That kind of sinks my theory :))) Is there anything official on this? I couldn't find anything...
Yeah, plenty, but most of it is locked away in massive patent applications, and those use language that is almost, if not entirely, incomprehensible. I think Matt Cutts has tried to break it down into simpler steps a couple of times over the last few years. Maybe someone else can recommend some.
Yeah, I looked into his instructions. I found only references on how to make Googlebot index or not index the pages you want, and on www vs. non-www issues: "Herding Googlebot" type of posts. I searched quite extensively for anything that could be related to this topic and came up with nothing of significance...
I haven't read these, but from their snippets they might get you started: http://highscalability.com/google-architecture and http://www.woodpecker.org.cn:9081/doc/GoogleFS/google_clusters.pdf
Nope, nothing there either. I found this: "Local search robot spider indexers locate files to index by following links, just like webwide search engine spiders. You specify the starting page, and these indexers will request it from the server and receive it just like a browser. The indexer will store every word on the page and then follow each link on that page, indexing the linked pages and following each link from those pages." I don't think it is very authoritative, though :) I left a comment on Matt Cutts' blog, hoping he will respond. Thank you for your interest, though. I think it is an intriguing issue.