Published: Jul 23, 2008 - 04:52 am
Story Found By: aimClear 1291 Days ago
Category: SEO
26 Comments
Comments
There are several errors in the .htaccess file.

1. Do not mix rules from different modules, otherwise you will not be able to guarantee the order they are processed in. Redirect comes from mod_alias and RewriteRule comes from mod_rewrite. Use only mod_rewrite for all of these.
2. Make sure that all Redirects are processed before all of your Rewrites. Failure to do so will expose your internal filepaths to browsers and bots.
3. Make sure that the most specific Redirects are processed first, and that they fix the domain etc. within their own redirect. List the general "catch-all" stuff last (like the generic non-www to www redirect).
4. Test, test, and test again, for all expected inputs and as many unexpected inputs as you can think of (www and non-www, with and without trailing "/" on folder names, etc.). Use the Live HTTP Headers extension for Mozilla Firefox to make sure that you do not have "chained redirects", as those can cause major issues.

That said, there are several problems noted with the data in WMT at the moment. There are several threads over at WebmasterWorld that have been running since the end of June 2008 with various related notes and observations.
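A minimal sketch of the ordering described in points 1 to 3, assuming a hypothetical site at example.com whose canonical host is www.example.com. The paths and the pages.php script are made up for illustration and are not taken from the article's .htaccess file.

```apache
RewriteEngine On

# 1. External redirects first, most specific first. Each one names the
#    final host itself, so no later rule has to fire for that URL.
#    (old-folder/old-page.html and new-page are hypothetical paths.)
RewriteRule ^old-folder/old-page\.html$ http://www.example.com/new-page [R=301,L]

# 2. The general catch-all redirect comes last among the redirects:
#    non-www to www for everything not handled above.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 3. Internal rewrites come after all redirects. There is no R flag,
#    so the browser never sees the internal script path.
#    (pages.php is a hypothetical script name.)
RewriteRule ^new-page$ /pages.php?name=new-page [L]
```

Because every directive here comes from mod_rewrite, the rules run in the order they are written, which is the guarantee point 1 is after.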
Thank you very much for your comment. I will check my website against your points and post my findings and changes.
The Apache code has been updated in the article. The basic Redirect 301s at the top are working fine, and the RedirectMatch lines are also working fine. Now I need to figure out how to code lines 13-23 to remove the www and the trailing slash in one redirect. Any ideas?
For instance, with the "www", going from http://www.marcgrabanski.com/code/ui-datepicker/ to http://marcgrabanski.com/pages/code/jquery-ui-datepicker takes way too many redirects.
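One way the single-hop redirect asked about here could be written, as a sketch only. It assumes the rules live in the .htaccess file at the site root and that the affected URLs are virtual paths rather than real directories (for a real directory, mod_dir would add the trailing slash back and could cause a loop).

```apache
RewriteEngine On

# Specific rule first: matches the old path with or without the trailing
# slash, on any host name, and goes straight to the final URL in one hop.
RewriteRule ^code/ui-datepicker/?$ http://marcgrabanski.com/pages/code/jquery-ui-datepicker [R=301,L]

# General rule last: fires if the host has the www OR the path ends in a
# slash, and fixes both at once. The "./$" pattern requires a character
# before the final slash, so the bare root URL "/" does not loop.
RewriteCond %{HTTP_HOST} ^www\.marcgrabanski\.com$ [NC,OR]
RewriteCond %{REQUEST_URI} ./$
RewriteRule ^(.*?)/?$ http://marcgrabanski.com/$1 [R=301,L]
```

The specific rule does not test the host at all, which is what lets the www and non-www versions share one redirect.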
You still have directives from different modules.
This is a really interesting exchange between serious code-heads... :)
Pretty basic SEO QA stuff, Marty. The problem that I still see is that mixed-case URLs redirect to the home page, and then you have a home page with "page not found" text that could throw off any keyword targeting you are trying to do on the home page.
g1smd: I've been wrestling with this for hours and hours, yet still cannot figure out how to combine all of these rules so that they use directives from only one module. Could you advise on how I could accomplish this?
andrewsho: I could only see this being basic if you do this all the time, just like anything, I guess. For most developers, this is considered a difficult task to troubleshoot because it is outside the scope of what we normally work on or have to deal with. I've already taken this much farther than most, to even find that there is a problem and attempt to address it with what I know and have found out thus far.
You guys are pros and do this all day, which is why I have come to you for help!
1Marc, no disrespect intended - I was responding to Marty's code-head comment. What I meant was that testing that all versions of your URLs 301 to a single version is standard SEO operating procedure. Making that happen, depending on your system, can either be simple or a total hassle. Hope you were able to figure it all out.
I have a bandage solution for now that will work, but it is not ideal. I still need to figure out how to combine the rules to use one module.
@1Marc: Andrew is my friend and a smart guy. I'm sure he means no offense :).
There is no need to combine all of the rules into one super-rule.

The reference to modules is to the mod_alias and mod_rewrite Apache modules. Redirect comes from mod_alias and RewriteRule comes from mod_rewrite. Do not mix rules from different modules, otherwise you will not be able to guarantee the order they are processed in.

So, change the "Redirect" line to use "RewriteRule" instead.
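As an illustration of that change, here is a sketch using hypothetical paths on a placeholder example.com domain (none of these URLs come from the article). The mod_alias line is shown commented out above its mod_rewrite replacement.

```apache
# mod_alias form (commented out, being replaced):
# Redirect 301 /old-page http://example.com/new-page

# mod_rewrite form. In a root .htaccess file the pattern is matched
# against the path without its leading slash.
RewriteEngine On
RewriteRule ^old-page$ http://example.com/new-page [R=301,L]
```

One difference to keep in mind: Redirect matches on a path prefix, while the anchored RewriteRule pattern above matches only that exact path, so any sub-paths need their own pattern.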
andrewsho: no offense taken at all.
g1smd: I changed all of the Redirect lines to RewriteRules, which has cleaned up most of my 301 redirects. One issue remains: I need to figure out how to make a RewriteRule to convert http://marcgrabanski.com/tags.php?tag=FreeTools to http://marcgrabanski.com/tag/free-tools
Could you help with this rule?
You'll need to test %{QUERY_STRING} as well as %{REQUEST_URI}, because the query string is not a part of the URL path that RewriteRule matches against.

Otherwise it is much like any of the other redirects. Oh, and as it is more specific, it goes *before* the other rules, and needs to fix www and non-www within the same rule, so that the later rules do not have to be invoked for that URL.
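A sketch of what such a rule might look like for the single tag mentioned above. It handles only that one known tag value. A general CamelCase-to-dashes conversion is not something plain .htaccess rules do well, since RewriteMap can only be declared in the server or virtual host configuration.

```apache
RewriteEngine On

# Test the query string separately; the RewriteRule pattern only sees
# the path. This rule goes before the general www rule, and it names the
# final host itself, so www and non-www requests both finish in one hop.
RewriteCond %{QUERY_STRING} ^tag=FreeTools$
RewriteRule ^tags\.php$ http://marcgrabanski.com/tag/free-tools? [R=301,L]
```

The trailing "?" on the target drops the original query string, so "tag=FreeTools" is not appended to the new URL.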
g1smd: OK, but I'm still not sure how to change the CamelCase words to dashes, so I'm thinking about just creating a tags.php file that does the 301 redirects and calling it a day. Everything else is working fine.
I created a tags.php file that does the 301 redirects. That seems to work just fine and is a simple solution. I think this means that things are all buckled up now for my 301 redirects. Thank you everyone for your help. Now it is time to wait to see if Google likes things on the next indexing of my website - don't you just hate the SEO waiting game?
Test using inputs that you expect, as well as those that you don't - especially from both www and non-www, and non-valid URLs.

Make sure there is no redirection chain. Use the Live HTTP Headers extension for Mozilla Seamonkey or for Mozilla Firefox to check out the HTTP response codes.
After testing, I found only one double 301 redirect issue. This was when "www" was combined with "tags.php": the RewriteRule removed the www, then the PHP redirect in tags.php took you to the correct URL.
To fix this, I added one exception to the "remove www" rule in the .htaccess, which allows tags.php to do the complete redirect:
RewriteCond %{REQUEST_URI} !.*tags.php.* [NC]
The new full www rewrite looks like this:
RewriteCond %{HTTP_HOST} ^www.marcgrabanski.com$ [NC]
RewriteCond %{REQUEST_URI} !.*tags.php.* [NC]
RewriteRule ^(.*)$ http://marcgrabanski.com/$1 [R=301,L]
I have searched and do not find any remaining issues. The old URLs all appear to be pure 301 redirects. Thanks again!
I thought there might be a lurking issue or two like that somewhere in the code. Testing found it, before Google did. Google would have made a mess of indexing your site had they found it first.
In your new code, most of the .* patterns are redundant and can be removed (those NOT in brackets).
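Applied to the rule quoted above, that suggestion might look like the following sketch. Escaping the literal dots is an extra tidy-up that was not requested in the thread; it only stops "." from matching any character.

```apache
# Remove the www, except for tags.php, which does its own full redirect.
RewriteCond %{HTTP_HOST} ^www\.marcgrabanski\.com$ [NC]
# An unanchored pattern already matches anywhere in the string, so the
# leading and trailing .* are not needed.
RewriteCond %{REQUEST_URI} !tags\.php [NC]
# The bracketed (.*) stays: it captures the path for $1.
RewriteRule ^(.*)$ http://marcgrabanski.com/$1 [R=301,L]
```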
Never, ever use .htaccess for redirects is my personal advice. Why bother when you can use mod_rewrite and PHP to do the work?
IMHO, .htaccess should be for configuration, and scripting should be for logic, which redirects are. That way, you KNOW where the issue is (with the PHP), rather than having to guess (scripting or .htaccess???).
Also, with PHP, you can set up a system to handle and check things (put requests in a DB, email yourself links, etc.) that .htaccess isn't designed for.
And one last benefit: people are good with PHP, but .htaccess is infrequently used, so it is not as easy to guarantee the quality of the work.
My $0.02 anyway.
*** Never, ever use .htaccess for redirects is my personal advice. Why bother when you can use mod_rewrite and PHP to do the work? ***
Last time I looked, the mod_rewrite code went in the .htaccess file - most people don't have the luxury of adding it to the httpd.conf file.
Putting it in the server configuration files, either in .htaccess or httpd.conf, is way more efficient than adding it to any of the server-side scripting.
You only need one rewrite rule: run every request for something that doesn't exist as a file (i.e. leave CSS, Flash files, images etc. alone) through a single script (in the case of WP, index.php).
Efficiency is a myth, as computers are fast, and people time is slow. Putting redirects into .htaccess has these consequences:
- Requires the testing of a whole extra thing.
- Forces someone to learn the intricacies of a new language (Apache config rules) that are not likely to be retained in their "RAM", as they'd be worked on so infrequently.
- Causes there to be two places errors can occur (.htaccess and the scripting language).
- Takes extra time to debug.
- Forces people to read through whole new sets of documentation, often with unfamiliar terms, to debug what should be trivial issues.
Do what WordPress does and run everything through one script, and make your life easy.
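For reference, the single-script approach described here usually comes down to a block like the following sketch, modelled on the rules WordPress generates for a site installed at the web root. The index.php path is an assumption for any other setup.

```apache
RewriteEngine On
# Leave requests for real files (CSS, images, Flash files) alone.
RewriteCond %{REQUEST_FILENAME} !-f
# Leave requests for real directories alone.
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else is handed internally to the one script, which then
# decides whether to serve a page, send a 301, or return a 404.
RewriteRule . /index.php [L]
```

With this in place, all redirect logic lives in the script, which is the trade-off being argued for: one place to debug, at the cost of starting the scripting engine for every redirected request.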
I can move my rewrite rules to the httpd.conf, but what is the point? I don't get why it matters if I have them in a .htaccess file, since a file can be put in SVN, tracked, and easily reverted or rolled back.
After using RewriteRules for a while, it seems they actually give you a lot of power that you don't otherwise have, like chaining a bunch of rules together which all run in one pass (if you use the same module). Yes, it takes a long time to learn; yes, it is complex. But personally I spent so much time and energy on this project because the whole point of it was to learn something new.
Mike: I understand what you are saying and agree that in a lot of cases PHP redirects just make sense. But I don't think PHP redirects should be the "end all" solution, as you are suggesting. Apache rewrites can be powerful once you get the hang of them. But in most cases I think a PHP script is just fine. In the end, to each his own!
It's really not that important as long as your linking is consistent, at least in my experience. If you have a managed dedicated server, get tech support to do it on the Apache end, then just use a 301 redirect checking tool to make sure it's SE friendly. I use Rackspace and they always set it up for me when I request it, at no charge.
.htaccess is also fine if you can do it yourself, or if you have web programmers, who should be able to do it in a minute (otherwise you should be hiring a new web programmer).
But if none of the above is an option, just move on - it's not worth losing sleep over.
I have a managed dedicated server and can do whatever I want with it. I chose to put the rewrite rules in my .htaccess file instead of httpd.conf because that file is in SVN. And I won't lose sleep over it.
Most of my .htaccess files run to at least 20 or 30 KB, with rules for blocking bots, setting ErrorDocument and DirectoryIndex parameters, and then invoking domain, index-file, and query-string canonicalisation, blocking HotLinks, and many other things. I like to have that layer taking care of many things, before the request has a chance to hit the server filesystem and scripts.
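A much-reduced sketch of a few of the layers listed above, for a placeholder example.com site. The bot name, file names, and domain are all hypothetical; this is not g1smd's actual file, and it leaves out domain and query-string canonicalisation.

```apache
# Error page and index file settings (404.html is a hypothetical file).
ErrorDocument 404 /404.html
DirectoryIndex index.php index.html

RewriteEngine On

# Block a bot by user agent ("BadBot" is a made-up name): 403 Forbidden.
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule ^ - [F]

# Index-file canonicalisation: if the client literally asked for
# index.php or index.html, redirect to the bare folder URL. Testing
# THE_REQUEST avoids a loop with DirectoryIndex's internal lookup.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/([^/]+/)*index\.(php|html)[?\s]
RewriteRule ^(([^/]+/)*)index\.(php|html)$ http://example.com/$1 [R=301,L]

# Hotlink blocking: refuse image requests whose referrer is set and is
# not this site.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
RewriteRule \.(gif|jpe?g|png)$ - [F]
```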
g1smd: Very interesting, sounds like one crazy .htaccess file. I'd like to examine the intricacies of a file like that.