eKstreme
Something smells in that article... kinda like... Seriously, great read :)
That's right, Jill. It's a great next step to take this kind of analysis forward. Because we know when the data was collected, we can also extract tag clouds and figure out how people tweet about news in real time. One kind of tweeting is linking to news sources, but it can also be keyword mentions, etc. It's a huge data pile that just needs a bunch of people to think up and write the analyses they'd like to see. Pierre
Well, a lot of readers here on Sphinn are no doubt beginners, and reading some of the things on the list is important to them. It's boring to more seasoned SEOs, sure, but we're not the only demographic here. And that's why we see a gazillion repeats make it to the top: it's the tyranny of the majority of this community. On a more technical level, the algo Sphinn implements to rank posts and make them go popular is not necessarily the best one. There are many alternatives and maybe it's worth experimenting with those. Pierre
Microsoft does this too - at least they used to. Don't act so surprised that Google is a big evil entity: it's a normal tech company.
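A rough sketch of the tag-cloud idea, assuming the data is a list of (timestamp, tweet text) pairs. The sample tweets, the stopword list, and the function name are all made up for illustration and are not from the actual data set:

```python
from collections import Counter, defaultdict
from datetime import datetime
import re

# Hypothetical sample data: (ISO timestamp, tweet text) pairs.
TWEETS = [
    ("2008-11-04T20:05:00", "Election results coming in now"),
    ("2008-11-04T20:40:00", "Watching the election coverage"),
    ("2008-11-04T21:10:00", "Results are in, what a night"),
]

# Tiny illustrative stopword list; a real analysis would need a fuller one.
STOPWORDS = {"the", "in", "a", "are", "what", "now"}


def tag_counts_by_hour(tweets):
    """Group tweets by the hour they were collected and count keywords."""
    buckets = defaultdict(Counter)
    for stamp, text in tweets:
        hour = datetime.fromisoformat(stamp).strftime("%Y-%m-%d %H:00")
        words = re.findall(r"[a-z']+", text.lower())
        buckets[hour].update(w for w in words if w not in STOPWORDS)
    return buckets


counts = tag_counts_by_hour(TWEETS)
```

Each hourly Counter can then feed a tag cloud, with word size scaled by count.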
Story: Matt Cutts Selling Links?
Is Matt selling links? I sincerely doubt it. I agree with Tamar that there is nothing serious here on that angle. And yes, there is a paid links penalty. Experienced it personally. But this submission does raise a simple question: given a blog like Matt's, which talks about the search industry 95% of the time, how can a search engine figure out that the remaining 5% "fun" posts are really the blogger's own and not paid posts or links? That's a thought for a good Sunday morning :) Pierre
Awesome news for both Donna and SEP. A big round of congratulations deserved for both of them. SEO-Scoop can't be in better hands! Pierre
Sun should buy Yahoo!, merge it with AOL, and then MS should buy the combined thing they form, and we'll start hoping they all go out of existence and take their crappy products with them. Come on, let competition rule the market, and might as well enjoy it.
Story: Is SEO Blogging Worth It?
Is blogging worth it? Depends on why you're doing it, as Jill mentioned. If it is:
For money from ads, hell no, it's not worth it.
For building a reputation to attract clients, definitely yes if you do it right.
For engaging the SEO community (just like in any other industry), then definitely yes.
For your innate love of writing, then yeah, I guess you're getting more satisfaction than not writing.
There are other reasons to blog, and you need to think of it as an activity and what you're getting out of it. Shocking, isn't it.
As for a lot of SEOs regurgitating the same old stuff, then read some other blogs :) One tip: as an SEO, start reading blogs about adjacent topics like usability or web programming or whatnot. You'll be amazed what you can see as relevant from these related fields and start applying to your work.
My 2 cents. Pierre
Nice. I put together a search training pack for our company and it included lots of examples like these. Also try: [salary filetype:.xls site:.org] and also [salaries...]. Contact information/details are also good examples. Another thing is to get creative with keywords. The word [highly] is a good modifier (highly confidential, highly innovative, etc). And [press release] is another good one. What else can you think of? And don't get me started on how useful this kind of searching is for social engineering.
Well done @neyne! On the way to bed last night, I thought this would be a great search to do. I was too tired and said, nah, SEOmoz wouldn't do that as they know about it. They're in the industry and surely they wouldn't have overlooked this loophole.
Rand, you hit it on the head: "We're tenacious about including data, but only to a point. A webmaster dedicated to keeping us out certainly could through the method you described." I don't want you on my site and I don't want to have to fight you off. Linkscape is not being polite because it's going after my data by hook and by crook and I have to spend precious time to fend it off. It's no different than an MFA scraper: instead of making money from ads, you're making money from subscriptions. They're in the same bucket on that front. Your statement clearly shows to me that you do not want to be polite. All the bots listed on your sources pages are polite, and they serve me well: Google and Yahoo send me traffic. What does Linkscape do for me? Nothing good. On the contrary, it wants to hurt me by giving my competitors a leg up. This is the heart of the problem: I don't trust your tool right now. I don't like the fact you're not honest about how to effectively and efficiently block it. We don't have time to fight the bad bots and you. It's all about politeness, and Linkscape is encroaching on my territory, namely my sites. It has no business being there and I'd like it to go away. I'm happy to add something to robots.txt and that's it. Let me state this again clearly: Linkscape is not welcome on my sites and the onus is on SEOmoz to make it easy for me to keep it away.
Re noarchive: SEOmoz needs to see the HTML of the page, at the very least to see the SEOmoz-specific meta tag, right? That's one. Two: SEOmoz keeps talking about their crawlers, which are really bots that SEOmoz does not control. You keep talking about things like dotbot and Y! Slurp and GoogleBot. So if we block dotbot and the rest, we block SEOmoz's access to our HTML. But I don't want to block GBot just because of some pesky tool, but I can tell GBot and Y! Slurp not to cache the page with noarchive. I'll happily block dotbot and other unimportant bots. End result: you don't have access to our HTML unless you crawl it yourself. And if you do, we'll find it and block that too. Of course, I can't know if I'm right or wrong because you are not giving us straight answers. Don't be surprised if we react in kind. Pierre
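To make the "block dotbot, keep GBot" idea concrete, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt rules and the example.com URL are assumptions for illustration, and the exact token dotbot responds to should be checked against its own documentation. This only shows what a polite bot that obeys robots.txt would conclude; it does nothing against a bot that ignores the file:

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: shut out dotbot entirely, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: dotbot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a robots.txt-obeying bot with this token may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


PAGE = "http://example.com/some-page.html"  # hypothetical page
dotbot_ok = allowed("dotbot", PAGE)
googlebot_ok = allowed("Googlebot", PAGE)
```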
Here is another yes/no question: Would using a meta tag to state noarchive, as specified at http://www.google.com/support/webmasters/bin/answer.py?answer=35306 , also remove our pages from Linkscape? Pierre
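For reference, the tag in question is the standard robots meta tag with a noarchive directive. The sketch below, using only the standard html.parser module, shows how any tool that already holds the HTML could detect it. Whether Linkscape actually does this is exactly the open question; the class and function names are hypothetical:

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


SAMPLE = (
    '<html><head><meta name="robots" content="noarchive"></head>'
    "<body>Hi</body></html>"
)
directives = robots_directives(SAMPLE)
```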
No no, Rand. Question 1 was about the robots that you control, not third party sources. And for a robot, it's a piece of code that uses your bandwidth to download data from websites onto your computers. Not third party robots, but SEOmoz robots. If you have a piece of code that goes through pages that you already have (an index, a set of files, cached copies, whatever you want to call them), whether you downloaded them or got them from 3rd party sources, then that is NOT a crawler. It's a parser. That's not a semantic difference but an important technical fact that needs to be stated clearly. A crawler is only one component of the system that retrieves (crawls) pages from the internet, stores them, analyzes them, and calculates link metrics. As for 1b, the link you provide does not show any UA for a robot that SEOmoz owns and so you do not, according to the page, own a crawler. This is in contradiction to the Yes answer you gave to question 1. And I find it very ironic (if also a touch rude) that a tool sold as a competitive intel tool is being secretive for competitive intel reasons. Pierre
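To illustrate the parser side of that distinction, here is a small sketch that builds link data purely from pages that are already stored, with no network requests at all. The stored page, its URL, and the names used are made up for the example and say nothing about how Linkscape is actually built:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Pull absolute link targets out of HTML that is already on disk."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def parse_stored_page(base_url, html):
    """Parse one stored page; nothing is downloaded here."""
    extractor = LinkExtractor(base_url)
    extractor.feed(html)
    return extractor.links


# Hypothetical stored copies, e.g. obtained from someone else's index.
STORED = {
    "http://example.com/": (
        '<a href="/about">About</a> <a href="http://example.org/">Other</a>'
    ),
}
link_graph = {url: parse_stored_page(url, html) for url, html in STORED.items()}
```

A crawler would be the separate piece that fetches those pages over HTTP in the first place, and that is the piece that sends a user agent and reads robots.txt.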
I have a simple set of questions for SEOmoz to cut through the chatter and get to the bottom of the technical details of the Linkscape bots and index.
1. (Yes/No) Does SEOmoz control computers that contain a web robot that retrieves data for Linkscape? Computers are defined as physical hardware that SEOmoz owns or virtual services that SEOmoz has access to, such as Amazon Web Services or hosting accounts. If yes:
1a. Does the web robot retrieve and obey the robots.txt?
1b. If it obeys robots.txt, which User-Agent string does it respond to?
1c. What is the HTTP USER AGENT of the robot? By HTTP USER AGENT, I mean the HTTP header that is sent with each request according to the HTTP 1.0 or 1.1 protocol specifications.
2. (Yes/No) Does SEOmoz get hold of data that robots outside its control retrieve? By outside of its control, I mean robots that are built, run, and maintained by other companies. Getting hold of data can involve using an API, buying the data on disks, or retrieving it online. If yes:
2a. Which robots' data does Linkscape currently use to build its index? I don't care about the potential for future use, I care about now.
2b. Which robots' data has Linkscape used in the past since its inception?
So, straightforward questions that need clear-cut, simple answers. Thanks, Pierre
Rand, you did NOT disclose a UA. A user agent is defined in the HTTP protocol (e.g. version HTTP 1.1 at http://cli.gs/QqayjG ). You disclosed a meta tag that only an HTML parser can read. These are very different things and it's not a semantic difference. As for a crawler, I agree with the comments above that the wording suggested there is a spider that SEOmoz controls, but you know that I didn't think that's true (you saw my blog post). The way I understand LS to work is that SEOmoz gets hold of downloaded pages from around the web, say an index of cached pages commercially or freely available. Those are then parsed on SEOmoz computers and the link graph calculated on SEOmoz computers. Right or wrong? If wrong, what's the exact method of operation? Again, a parser and a crawler are very different things that are not just semantically different. As for any suggestions that SEOmoz, or you personally as Rand, were driven by any "malicious" motives (for the lack of a better word), I think that's just plain wrong. I think as a marketer you could have handled this better, but I don't see any evidence of malice. Cheers, Pierre
I knew it! Come on, out with it, who's been using social media for marketing purposes?
Welcome to 2008, Google. Now please make the terms and conditions of use sensible, like not blocking SEO-related queries. Pierre
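A small sketch of what "disclosing a UA" means in practice: the User-Agent is a header on the HTTP request itself, separate from the short token a webmaster puts in robots.txt. The bot name and URL below are hypothetical, and the request is only built, never sent:

```python
from urllib.request import Request

# Hypothetical values for illustration; not any real bot's identity.
FULL_USER_AGENT = "ExampleBot/1.0 (+http://example.com/bot.html)"
ROBOTS_TXT_TOKEN = "ExampleBot"  # what would go after "User-agent:" in robots.txt

# Build (but do not send) a request carrying the full User-Agent header.
request = Request("http://example.com/", headers={"User-Agent": FULL_USER_AGENT})

# urllib normalizes header names, so the key is stored as "User-agent".
sent_user_agent = request.get_header("User-agent")
```

The full string is what shows up in server logs, which is why it matters for finding and blocking a bot that does not honor robots.txt.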
Story: The SEOmoz Linkscape Ghost
IncrediBILL: I didn't miss your post. DotBot could be the seed index. On the DotBot website ( http://www.dotnetdotcom.org/ ), they state their index has 10,355,148 pages so far, which is a far cry from the claim SEOmoz makes of 30+ billion. The domains count is also much, much smaller. So DotBot could be part of the story still.
The only acceptable method is to fully disclose the UA. Not a string to be used in robots.txt but the full user agent. Anything else is just scraping on the same level as spammers. Think about that.
Has using nofollow decreased anyone's spam? I doubt it discourages any form of spam.
@onreact: Cligs saves the analytics for your short URLs. To keep them private, they have to be password protected. Pierre
Clig deletion is top of the to-do list. Also editing a clig's details. And yes, there is a new set of features that help you organize Cligs better. It's not folders though...


Story: Why You Should Look To Duality With Your Link Bait Methods