Story Found By: MikeDammann 1158 Days ago
Category: Sphinn Zone
Robots that do follow that link are automatically added to your site's .htaccess file with Deny statements.
Legitimate web robots sometimes get out of sync and follow it too, so if you want the script to run unattended, I recommend whitelisting them in your .htaccess file.
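
To illustrate the idea, here is a minimal Python sketch of what such a trap handler might do. It is not the script from the linked article: the names `WHITELIST`, `is_whitelisted`, and `ban_ip` are hypothetical, the whitelisted range is a placeholder documentation network, and the sketch assumes an Apache setup where `Deny from <ip>` lines in .htaccess are honored.

```python
import ipaddress
from pathlib import Path

# Hypothetical whitelist: networks that must never be banned.
# 192.0.2.0/24 is a placeholder range; substitute the crawlers you trust.
WHITELIST = [ipaddress.ip_network("192.0.2.0/24")]


def is_whitelisted(ip: str) -> bool:
    """Return True if the address falls inside any whitelisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in WHITELIST)


def ban_ip(ip: str, htaccess: Path) -> bool:
    """Append a 'Deny from <ip>' line for a robot that followed the trap link.

    Returns True if a line was added, False if the address is whitelisted
    or already banned. ipaddress.ip_address() raises ValueError on anything
    that is not a valid address, so arbitrary text never reaches the file.
    """
    if is_whitelisted(ip):
        return False
    line = f"Deny from {ipaddress.ip_address(ip)}"
    text = htaccess.read_text() if htaccess.exists() else ""
    if line in text.splitlines():
        return False
    # Make sure the new directive starts on its own line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with htaccess.open("a") as fh:
        fh.write(prefix + line + "\n")
    return True
```

The trap page would call something like `ban_ip(visitor_ip, Path("/path/to/.htaccess"))` on each hit (the path here is a placeholder). Checking the whitelist before writing is what lets the script run unattended without locking out a legitimate crawler that strays onto the link.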
7 Comments


Same here, and the same goes for many other people I know.
Great article, I've been looking for a way to stop the scrapers from jacking my data. Who other than the major search engines would you add to your whitelist? Will you be hurting your rank if you keep the scrapers from grabbing your content?
Hmm, you can ask in that forum, or I can ask for you if you want.
I just asked, stay tuned for a response :)
Great way to protect your content from pirates. Thanks a lot!
Thanks :)
I wrote my own bot detector. If a bot retrieves too many pages in too short a time, I block the IP. You've got to be careful with certain bot prevention methods, because you don't want to block the good bots like the search engine spiders.
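
For readers curious what a rate-based detector like this could look like, here is a rough Python sketch. It is not the commenter's code: the class name `RateDetector`, the thresholds, and the plain set of whitelisted IP strings are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class RateDetector:
    """Block an IP once it requests more than max_hits pages within window seconds."""

    def __init__(self, max_hits=30, window=10.0, whitelist=()):
        self.max_hits = max_hits          # assumed threshold, tune for your site
        self.window = window              # sliding window length in seconds
        self.whitelist = set(whitelist)   # IPs of good bots that are never blocked
        self.hits = defaultdict(deque)    # ip -> timestamps of recent requests
        self.blocked = set()

    def record(self, ip, now=None):
        """Record one page request; return True if the IP is blocked."""
        if ip in self.whitelist:
            return False
        if ip in self.blocked:
            return True
        now = time.monotonic() if now is None else now
        recent = self.hits[ip]
        recent.append(now)
        # Forget requests that have fallen out of the sliding window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) > self.max_hits:
            self.blocked.add(ip)
            del self.hits[ip]
            return True
        return False
```

Calling `detector.record(ip)` on every page view returns True once the visitor crosses the threshold. The whitelist check comes first so that search engine spiders are never counted, which is the caution raised in the comment above.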