Published: Oct 21, 2008 - 01:04 am
Story Found By: johnandrews 1670 Days ago
Category: SEM
CAUTION! Any time a company asks you to customize your web publishing for them, especially if that process includes adding their BRAND to your web pages, you should think carefully before acting.
By branding your pages with SEOMOZ, you demonstrate their INFLUENCE over you, and trust me, that influence will be leveraged further in the future. If Randy needs to demonstrate the reach and importance of his LinkScrape project, for example, he could simply point to how many web pages in Google specifically mention his "crawler". Your web pages will be counted as evidence of SEOMOZ's success.
Never, ever add someone's brand to your web pages unless you want to actively support them.
Please discuss if you agree/disagree. I can't see any reason why SEOMOZ would ask for meta tags, except as a way to further benefit from the activity.
For reference, Archive.org publishes bot-blocking instructions *and* checks the robots.txt directive before displaying archived web data, as a means of providing privacy for those who want it. No meta tags.
26 Comments




Comments
When I read about the seomoz meta tag, I remembered the footprint the digital point link coop left behind. I don't want my sites targeted via footprints. What's so hard about following robots.txt?
nickwilsdon answered my question on why it's so hard to follow the robots.txt: http://sphinn.com/story/80142#c56044 In short, they don't control the crawling.
@merrick - that's not entirely accurate. We control the vast majority of crawling, but if/when we backfill with data from other providers, we need to make sure that those sources don't contain pages people would want removed from our list of links. Thus, a meta tag on the page level is the best way to go.
Ok, so assuming I did block my own websites, surely all my competitors could still see my link information because I am unable to block crawling of all the sites that link to me? Correct? If so, then there seems little point in blocking it.
@Merrick As Donna said, you can block DotBot with robots.txt. This spider seems to be the main robot providing data for Linkscape (and the one Moz "controls"):
User-agent: dotbot
Disallow: /
However, your site information can still be included from the other sources Linkscape buys data from (or free API data from major search engines). As Rand suggests, the META is the only way they can ensure you are removed.
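As a rough illustration of the robots.txt rule quoted above, the following sketch uses Python's standard urllib.robotparser to check that the rule denies a crawler identifying itself as dotbot while leaving other crawlers unaffected. The example.com URL and the "somebot" agent name are placeholders chosen for the example, not anything from this thread, and the sketch says nothing about how any real crawler actually interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The rule quoted in the comment above.
ROBOTS_TXT = """\
User-agent: dotbot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL and second agent name, used only for illustration.
dotbot_allowed = is_allowed(ROBOTS_TXT, "dotbot", "http://example.com/page.html")
other_allowed = is_allowed(ROBOTS_TXT, "somebot", "http://example.com/page.html")
print(dotbot_allowed, other_allowed)
```

This only confirms what the rule says to a compliant parser. As the comment notes, it does not cover link data obtained from other sources.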
@patrickaltoft - yes, that's right. You'd need to get all of your link partners to block in order for us not to show links to your sites/pages. @nick - yes, meta is the only way you'll ensure removal from the results right now, though we may build in blocking for sites that verify in the future.
@nickwilsdon got it. @randfish thanks for clarifying, I understand why a meta tag is needed - it's just not ideal.
Do I hear the x-files theme playing?
Now if you did have a certain network of sites you didn't want competitors to know were linking to a certain site, then you would block it on all those sites, regardless of whether other sites you had no control over were linking to the certain site. Linkscape would only show a portion of what was linking to what. And some IP cloaking could prevent meta tag footprints, perhaps. Just thinking out loud.
Even if SEOmoz doesn't control the crawling of all the pages, surely it's fairly trivial to get the robots.txt once a week or so for each domain discovered and drop the data from those that want to be excluded.
@liamvictor, maybe if you controlled a single site and didn't have anything better to do with your time. Unfortunately I don't care to spend any amount of time making up for someone else's attempt to make money off my data, and certainly not once a week for every single site I own.
@Skitzzo Perhaps I didn't make myself clear. I'm not suggesting publishers do anything apart from add a rule to robots.txt, just like we do with any other spider we want to exclude. I'm suggesting that SEOmoz get the robots.txt themselves and promptly exclude the sites that don't wish to be in their index. They can easily get this regardless of whether they actually spider the pages themselves, and can therefore exclude pages much quicker than the 60 or so days Rand spoke of.
Err, the "once per week" reference wasn't for you to do anything at all. It was in reference to bots maybe checking it once per week.
@g1smd Exactly. For *us* it would be a set it and forget it change to the robots.txt
OK, can somebody explain what I am seeing here? http://ws.arin.net/whois/?queryinput=seomoz.org
OrgName: seomoz.org
OrgID: SEOMO
Address: dotnetdotcom.org
Address: 93 S. Jackson Street 10070
City: Seattle
StateProv: WA
PostalCode: 98104-2818
Country: US
Comment:
RegDate: 2008-07-07
Updated: 2008-07-07
AdminHandle: NGE11-ARIN
AdminName: Gerner, Nick
AdminPhone: +1-206-299-9628
AdminEmail: admin@dotnetdotcom.org
TechHandle: NGE11-ARIN
TechName: Gerner, Nick
TechPhone: +1-206-299-9628
TechEmail: admin@dotnetdotcom.org
@liamvictor ahh, my bad. I thought you meant we would have to once a week update our files to keep up with any new data sources they release. Sorry about that!
I feel bad when even all you MENSA brainiacs can't protect yourselves from yourselves...
Since Linkscrape is making the profits, have Linkscrape check the robots.txt for permission *before* delivering reports. Just like archive.org.... it's not a problem with the so-called "crawl", it can be blocked at the service delivery point. @neyne yeah, there's apparently some funding of IPs going on.
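The "check robots.txt at the delivery point" idea suggested in this and earlier comments could look roughly like the sketch below. It is only an illustration under stated assumptions: the robots_by_host cache, the filter_report function, the example hostnames, and the choice of "dotbot" as the agent name to honor are all hypothetical, and nothing here describes how Linkscape or archive.org actually work.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Assumption: the agent name the reporting service agrees to honor.
SERVICE_AGENT = "dotbot"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that was fetched earlier."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def filter_report(urls, robots_by_host, agent=SERVICE_AGENT):
    """Keep only URLs whose host's robots.txt allows `agent`.

    robots_by_host is a hypothetical cache mapping hostname -> robots.txt
    text. Hosts with no cached file are treated as allowed.
    """
    parsers = {host: build_parser(text) for host, text in robots_by_host.items()}
    kept = []
    for url in urls:
        parser = parsers.get(urlsplit(url).hostname)
        if parser is None or parser.can_fetch(agent, url):
            kept.append(url)
    return kept


# Placeholder hosts used only for illustration.
cached_robots = {"blocked.example": "User-agent: dotbot\nDisallow: /\n"}
report = [
    "http://blocked.example/links.html",
    "http://open.example/links.html",
]
visible = filter_report(report, cached_robots)
print(visible)
```

The point of filtering at report time rather than crawl time is that it works no matter where the underlying link data came from, which is the objection raised against relying on robots.txt alone.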
This might be a tad paranoid, but going with the footprint theory I certainly wouldn't add anything SEOMOZ to any of my clients' sites. Calling attention to yourself or to your clients that they are under "SEO supervision", especially after the past issues with SEOMOZ having no problem playing "tattle tale" on other sites like a certain directory that shall go unnamed... doesn't sound like such a great idea. It seems like they make money off of your data one way, and if you try to opt out (which it seems like you really can't), in the future they or the companies they are getting their data from (you know, small companies like Yahoo, Google, etc...) could use those "lists" against thousands of sites. Think about what happened with link exchanges and the big daddy update in '05... It's sad when SEOs appear to be trying to capitalize and cannibalize on not only other SEOs but, it appears, the industry as a whole.
I have more than a hundred sites hosted on a server. Will Linkscrape take up my server resources the way major search engines and spambots did?
Adding a meta tag definitely suggests that SEOMOZ is crawling the Web. So what was all that endless argument about SEOMOZ using somebody else's data all about? Not only is SEOMOZ not important enough to add a special meta tag, but my concern is that any use of NOINDEX might cause pages not to be indexed by the major search engines. If SEOMOZ wanted to use a meta tag then they should have used something like: noSEOMOZ. If SEOMOZ has their own bot then the robots.txt would obviously do the job. My next line of defense is WordPress plugins that are supposed to protect my blog against evil bots.
This conversation, period (along with the half-dozen Linkscape-related stories on the front page of Sphinn), gives the SEOMOZ brand more recognition than a few invisible meta tags ever will - and isn't brand recognition the only thing at stake here? "Their INFLUENCE over you" = brand recognition. That's all. Don't these comments represent the SEOMOZ brand's influence over all of us? Does that make everyone supportive?
@claye: There's no question Randy and SEOMOZ are influential in search marketing - you can't take that away from them. They deserve their brand, no question. But this is a bigger picture. If the fix for a negative problem is putting SEOMOZ into your headers, then the fix is making the problem bigger for the long term. I think we need to resist supporting such negative actions with endorsement. Talk about SEOMOZ and Linkscrape as much as the project deserves. I'm still amazed Randy hasn't turned this into a net positive for his company and Linkscrape, but that's his business.
But if we don't put SEOMOZ in our robots.txt, then how are we going to stop Linkscape from crawling our server?
@johnandrews curious why you call Rand "Randy."
@SEMSEO robots.txt is not the same as meta tags... an exclusion in robots.txt is clearly a negative vote, while a meta tag is a supportive vote. That's all this post was about... being careful not to accidentally endorse a service when you actually wanted to block it. @jill that's the second time I've been asked that... is that not his name? I called him that when I met him in Portland and he didn't seem to mind. I sense that now that things are clarified, some might be looking for more in these discussions. Sorry... it's all about clarity, not "personal agendas". I know that's not as colorful as some would like things to be...