Posted By: jeffquipp 147 days ago
Topic Type: News Story (Jump to http://searchengineland.com)
Category: Google SEO
13 Comments
Comments
Wow - this has lots of implications. Looks like everyone should take a good look at their robots.txt and make sure Google doesn't go crazy indexing your site.
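For anyone who wants to check their robots.txt as suggested above, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the example.com URLs are made up for illustration, not taken from any real site, and this only shows what the file says for a "Googlebot" user agent; it does not prove how Google's form crawling actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keeps Googlebot out of a search-results path
# and a contact form, and leaves everything open to other crawlers.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search
Disallow: /contact-form

User-agent: *
Disallow:
"""


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    for url in ("http://example.com/search?q=seo", "http://example.com/about"):
        print(url, googlebot_allowed(ROBOTS_TXT, url))
```

Swapping in your own robots.txt text and the URLs your forms submit to should show quickly whether those paths are covered.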
Mark, Google's doing this in a limited fashion. You can do a site:seoroi.com or site:brianchappell.com search for examples. Here's the official Google quote:
"We are also mindful of the impact we can have on web sites and limit ourselves to a very small number of fetches for a given site."
If you'd rather not have it at all, I believe that Mike vanDeMar (smackdown.blogsblogsblogs.com) will be sharing his code as to how to block this dynamically - it's a pretty smart solution.
Also, my apologies to Google for falsely accusing them of leaking Google Analytics data into the index.
Also, here's Matt's announcement on the issue: http://sphinn.com/story/40122
Seriously this is scary. From an SEO point of view it is unnecessary - if the site owner wants stuff indexed they should make it available. From all other points of view I see this as a real problem.
1. Marketing uses forms all the time to capture lead information. And they measure it. Let's say Google tries to 'fill out' the form and get to the inside information once a day for a month. That is 30 hits on the form that were completed but did not convert. Kind of messes with the data.
2. No matter what they say, would you, as a security/risk manager, be satisfied that Google isn't going to try to 'guess' the login or password just because the button or form element has something familiar on it like 'password'? A lot of personal information (e.g. healthcare) and business information (e.g. intranet access) lies behind forms.
3. What if the form is about gathering demographic information? I am not sure that the demographics of a search engine, or of some bot with randomly chosen keywords, are what an advertiser is trying to capture.
So now we get GoogleBot instead of Donald Duck as the false name on our forms.
Bad News.
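On point 1 above (bot form fills skewing lead data), one rough way to clean the numbers is to drop submissions whose user agent identifies itself as Googlebot before counting. This is only a sketch: the `hits` records and their `user_agent` key are a hypothetical log format, and a user-agent string can be spoofed, so treat it as a first-pass filter rather than proof of who submitted the form.

```python
# Substring that Googlebot includes in its user-agent string.
GOOGLEBOT_TOKEN = "googlebot"


def is_googlebot(user_agent: str) -> bool:
    """Return True if the user-agent string claims to be Googlebot."""
    return GOOGLEBOT_TOKEN in user_agent.lower()


def human_submissions(hits):
    """Keep only form submissions that do not claim to come from Googlebot.

    `hits` is assumed to be a list of dicts with a "user_agent" key
    (a hypothetical log format used here for illustration).
    """
    return [hit for hit in hits if not is_googlebot(hit.get("user_agent", ""))]


if __name__ == "__main__":
    sample_hits = [
        {"form": "lead", "user_agent": "Mozilla/5.0 (Windows NT 6.0) Firefox/2.0"},
        {"form": "lead", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                                       "+http://www.google.com/bot.html)"},
    ]
    print(len(human_submissions(sample_hits)), "of", len(sample_hits), "hits kept")
```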
It's a very good improvement.
Way too many qualifiers there to count on it happening and to hope for useful crawls of important pages, rather than converting to CSS dropdowns to replace the form fields you want crawled.
Here are some excerpts from the Google Blog post:
Seriously, with all of those qualifiers, it almost sounds as though it would affect only a handful of sites at best.
I'm curious if this means we should eventually look at optimizing our forms. Not to get carried away and turn our forms into spam. But perhaps to write "I'm interested in your SEO work" instead of just writing, "I'm interested in your services."
On the face of it as a search engine user I might appreciate it.. but it sort of seems unnecessary .. it is not innovative in the larger sense of the word..
itravin... what, you mean like innovative..?
Sounds like this explains how Google's been indexing Wordpress search results.
It's just another autobot, so if you don't want to accept auto-submitted forms, add a form field and hide it from users via CSS. If it comes in completed, trust that it's a bot and don't accept the submission (or better, answer it with the appropriate response, whatever your SEO brain decides that response should be).
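A minimal server-side sketch of that hidden-field trick, assuming the form includes an extra input that CSS hides from human visitors. The field name `website_url` is invented for the example, and whether Googlebot actually fills in hidden fields is not something this thread confirms.

```python
# Hypothetical name of the extra input that is hidden from users via CSS.
HONEYPOT_FIELD = "website_url"


def is_probable_bot(form_data: dict) -> bool:
    """Return True if the hidden honeypot field came back filled in.

    A human never sees the field, so it should arrive empty or missing.
    """
    value = form_data.get(HONEYPOT_FIELD) or ""
    return bool(str(value).strip())


if __name__ == "__main__":
    human = {"name": "Donald Duck", "message": "Hi", HONEYPOT_FIELD: ""}
    bot = {"name": "x", "message": "y", HONEYPOT_FIELD: "http://example.com"}
    print(is_probable_bot(human), is_probable_bot(bot))
```

What to do when the check fires (reject, ignore quietly, or serve a specific page) is left to the site owner.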
One of the websites where Google does this is Lyrics.net, owned by one of my coworkers. I have published some findings on my blog:
http://www.lunchpauze.com/2008/04/googlebot-wtf-are-you-doing.html
iBrian, I think that's the explanation for several sites, yes.