Blocking malicious bots from accessing your website
It is always a nuisance when unwanted bots start hitting your website. They can increase the CPU load on your server and, if your website is database-driven, add MySQL overhead as well. Bot access is supposed to be controlled through the robots.txt file, but most malicious bots do not honor the rules defined there. Reputable crawlers (such as Googlebot) can be controlled through robots.txt or through the search engine's webmaster tools. For the rest, the best bet is to block them if they are hitting your website hard.
The following is a sample .htaccess rule that will help you block specific bots from accessing your website.
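# Flag any request whose User-Agent header matches "BOT" (case-insensitive)
SetEnvIfNoCase User-Agent "BOT" bad_agent
# Refuse requests that carry the bad_agent flag
Deny from env=bad_agent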
Please note that you should replace "BOT" with the name of the bot you want to block. For example, the following lines show the User-Agent strings sent by Bingbot and Baiduspider.
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
So, to block the above two bots, we can add the following lines to the .htaccess file of the respective website.
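# Flag requests from Bingbot and Baiduspider by their User-Agent strings
SetEnvIfNoCase User-Agent "bingbot/2.0" bad_agent
SetEnvIfNoCase User-Agent "Baiduspider/2.0" bad_agent
# Refuse every request that carries the bad_agent flag
Deny from env=bad_agent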
In the same way, add a "SetEnvIfNoCase" line with the corresponding bot name before the "Deny from" entry for each bot you want to block, and that will do the trick.
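On Apache 2.4 and later, "Deny from" is only available through the compatibility module mod_access_compat, so the same rule may be easier to maintain with the newer "Require" directives. The following is a minimal sketch, not a drop-in replacement: it assumes Apache 2.4 with mod_setenvif and mod_authz_core enabled, and an AllowOverride setting that permits these directives in .htaccess. It also folds both bots into a single pattern, since the SetEnvIfNoCase argument is a regular expression; the version numbers are left out here on the assumption that you want to block every version of these bots.

```apache
# Flag requests whose User-Agent matches either bot name (case-insensitive regex)
SetEnvIfNoCase User-Agent "(bingbot|Baiduspider)" bad_agent

# Allow everyone except requests that carry the bad_agent flag
<RequireAll>
    Require all granted
    Require not env bad_agent
</RequireAll>
```

To block another bot with this variant, add its name to the alternation inside the parentheses, separated by a "|" character.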