JohnMu
@robwatts Can you elaborate on "there are some bigger issues that stem from the whole thing which could be interesting to discuss" ? What are we missing or what could we do better in a starters guide?
@Halfdeck I don't have access to details like that (and I'm not even sure that the Gmail team would be able to provide it publicly either), but from what I've seen at Google, we tend to automate things. At any rate, I'm 100% sure there's nobody reading your email and manually disabling accounts for the heck of it. There are a lot of accounts out there :-). While I totally agree that having a "free hoster" email address does not give your business credibility in the eyes of many out there, I disagree that it's bad to use free services like Gmail for business reasons. Services like Hotmail, Yahoo mail and Gmail process more email than any other company out there. If anyone knows how to handle email and keep it running, they definitely do. Sure, things can go wrong (and they will, I don't think it can be avoided regardless of the provider), but I feel pretty confident depending on the people who keep those services running. As someone mentioned, Google Apps for your domain does provide phone support for some versions. Also, by using your own domain name you'll at least be able to re-route your email if anything should go wrong. I've moved almost all of my close relatives to Google Apps (from email hosted on my servers and on other ISP servers) and haven't regretted it one bit.
We recently blogged about the process that users can take to get their accounts back: http://googleblog.blogspot.com/2008/09/what-to-do-if-you-cant-access-your.html -- an important part (that makes everything else fall into place quickly) is keeping your verification number. I've been using Gmail since almost the start and have not had any significant issues (except for when I accidentally tagged all my mail as spam and couldn't find it anymore :P ). Personally, I'm more than happy to delegate my mail to the engineers at Google rather than chase after hardware/software issues myself (as we used to do in my old company).
Hi Nick, in general it's usually quicker to submit the malware review request through Webmaster Tools. Since these reviews take a bit of work on our side, we want to make sure that only the verified webmaster(s) submit them, which is why they're within Webmaster Tools. In addition, I believe Webmaster Tools provides additional information which can help the webmaster to find and remove the bad code. FYI, here's a screenshot of what it generally looks like within Webmaster Tools: http://tinyurl.com/5to6tv
Of course, if you're serious enough about your website to care about malware being placed on it, you should have verified your site in Webmaster Tools anyway - the diagnostic information can help save your site from technical issues impacting the search results, I wouldn't want to miss that.
Hi Barry! It looks like I may have been a bit too fast in answering that one :-). At the moment, as far as I know, we do not look at case sensitivity in web search. However, at some time in the future, we may decide to take it into account should we find that it is relevant and improves the quality of our results.
If you see differences at the moment, it is likely due to normal fluctuations and perhaps differences across datacenters. If you find drastic differences, it would be great if you could post them here or better, in the Webmaster Help groups so that we can investigate them appropriately.
Thanks for joining us in the chat, I hope you enjoyed it as much as we did!
Thanks for putting this together, Eric. I really like how you're using the crawl rate charts to predict crawling cycles, good idea!
HTML comments are comments by and for the developer(s), not content :-). I can't think of a reason for us to use them for normal web-search. There is one place where we do use them: http://www.google.com/codesearch (it doesn't cover HTML documents though :-))
Don't forget the second one by Jonathan: http://groups.google.com/group/Google_Webmaster_Help-Indexing/browse_thread/thread/8d427cdd556f644e :)
Story: Google: Bored at the Core
I never had this much thrill and saw this much innovation at my own startup :). Seriously. This company rocks, the people here rock.
Hi Kalena - just a note regarding #5, it may not apply to all search engines, but for Google you could just return HTTP result code 503 during the time when the site is down for maintenance (using the .htaccess or whatever). There's a note on this at http://googlewebmastercentral.blogspot.com/2006/08/all-about-googlebot.html Of course I wouldn't do this for 3 weeks over the holidays :-) (I used to see quite a few of those)
One other search-engine-style mistake (imo) is to have the server return full error messages to all clients when something goes wrong (instead of returning a 500)... how many SQL errors have your clients gotten indexed?
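As a rough illustration of both points, here is a minimal Python sketch using only the standard library. It is not how any particular site or search engine does it. The `MAINTENANCE` flag, the `render_page` function, the one-hour `Retry-After` value and the port are all assumptions made up for the example (`Retry-After` is a standard HTTP header, but it is not mentioned in the comment above). The idea is simply: answer 503 while the site is down, and answer a plain 500 without leaking error details when something breaks.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

MAINTENANCE = True            # hypothetical switch, flip to False when the site is back
RETRY_AFTER_SECONDS = "3600"  # assumed value: ask clients to come back in an hour


def render_page():
    """Hypothetical page renderer; a real one might query a database."""
    return b"Welcome to the site.\n"


def build_response(maintenance, render):
    """Return (status, headers, body) for one request."""
    if maintenance:
        # Temporary outage: 503 tells crawlers the page is not gone for good.
        return 503, {"Retry-After": RETRY_AFTER_SECONDS}, b"Down for maintenance.\n"
    try:
        return 200, {}, render()
    except Exception:
        # Never send the raw error (SQL text, stack trace) to clients.
        # A real handler would log the exception here.
        return 500, {}, b"Internal server error.\n"


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = build_response(MAINTENANCE, render_page)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the sketch server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), SiteHandler).serve_forever()
```

Calling `serve()` starts the server locally. Keeping the status decision in `build_response` makes it easy to check without opening a socket, and the same logic could sit behind an .htaccess rule or any other front end.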
Story: Rhea Gets Married!!!
"Top 10", gotcha, JohnWeb :-). I'll try to keep that in mind should I want to post something mainstream.
Good idea, Sebastian. It could be a bit problematic though, imagine someone accidentally sets up a proxy like that next to his normal site (or someone does it "for" him). On the other hand, that's like many other things - you're responsible for your own site, period.
@Sem-Advance the problem with the "Date of inception" is that people could go around and take over new or low-value (partially unindexed) sites by re-publishing their content on a higher-value domain. That's the same problem that plagues most other technical methods to recognize the original owner: if the creator of the copy is technically up-to-date, they could do whatever is necessary to register the content before the original owner has a chance to do that (or even knows to do it).
Just from looking at the tool on the outside, I think it does the following:
- access several proxies with a fake user-agent to get them to cache the page (assuming they need that and don't grab it on the fly)
- push links to those "proxified URLs" to blogs, either as automated comment spam or as postings on specially crafted splogs (blog + ping), in a way that Google recognizes the links and follows them.
In order for the proxies to take your site out of the serps (assuming it works the way they say), they would have to rank above your site, have more value than your real site. I can see how that might be an issue with zero-value sites, but any site that is already indexed will surely have more value than a URL that is fed only with comment or blog-spam?
(or am I missing a vital element?)
If it worked ... it would mean that automated blog comment spam would work. When was the last time that automated comment spam beat you in the serps?
Wow! Thanks, everyone :)
So... what message can I bring in to Google from you all?
Thanks for sphinning, JohnWeb! This is going to be exciting, I can't wait :-).
YADAC is right, but why do we have to debate it YA? Why does it take that much debating before it gets taken seriously by the webmasters / site owners and the search engines? What's wrong with only getting the sell-page with the abstract indexed? It could be a matter of a spam-report, getting that handled and done. (heh, I wonder if anyone did that :-))
Regardless of the kind of publisher and the current rules:
Should subscription / paid content be included in the normal web search results (not just the abstract pages, but the full content crawled and indexed, but not viewable for users)? What if it were tagged and usable as a filter like in Google News Archives?
I agree with Pierre and am for a level playing field, at least for the main web search index.
Google News Archives tags paid content; I like that, you know what you are going to get, no false expectations. I don't know about Google Scholar, but I hope they tag paid content as well. This kind of cloaking just doesn't belong in the main web search index, in my opinion. If they want, they could just have their abstract pages indexed like everyone else with paid content.
If WMW were to get into Google Scholar, would they also be allowed to cloak content to the engines and redirect users to signup-pages?


Story: Google Gives out SEO Advice