GIDForums  

  #1  
Old 15-Jan-2004, 09:25
BobbyDouglas

Blocking Spambots by .htaccess


When spambots visit your website, they are harvesting e-mail addresses so they can spam those accounts later. A simple way to counter this is to block them from your site. Here is one way to do it.

All code goes into the .htaccess file in the public_html directory. The rules use RewriteEngine, so mod_rewrite has to be enabled on the server.

Blocking by referrer
Code:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^http://(www\.)?whois\.sc [OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?ctechld\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?hyperspin\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?HostItCheap\.com
RewriteRule /* http://www.fakedomainofyourchoice.com/ [R,L]

A bit of explanation:
The ^ character stipulates that the referrer string must begin with the text that follows it.
The (www\.)? part means that the rule matches whether or not the www. is present in the referrer string (i.e., it will catch both http://banned.com and http://www.banned.com).
Note that [OR] follows every RewriteCond except the last one. Without [OR], consecutive conditions are ANDed together, and a request would have to match all of them at once.
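
The same referrer check can also be written as a single condition. This is only a sketch under a few assumptions that are not in the rules above: that https:// referrers should be caught too, and that the match should ignore case (the [NC] flag), so that HostItCheap is also caught when the referrer arrives in lowercase. The redirect target is still the placeholder domain.

Code:
```apache
RewriteEngine On
# One condition instead of four: the alternation lists the same referrer hosts.
# "https?" matches both http:// and https:// (an assumption, see above).
# [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?(whois\.sc|ctechld\.com|hyperspin\.com|hostitcheap\.com) [NC]
# Placeholder target domain; replace it with your own choice.
RewriteRule .* http://www.fakedomainofyourchoice.com/ [R,L]
```

With only one RewriteCond there is no [OR] to forget when a new host is added to the list.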

Sometimes the referrer looks fine, but the bot still gets through. Here is a way to ban bots by user agent.

Code:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule /* http://www.fakedomainofyourchoice.com/ [R,L]

What would be nice now is to research where each bot actually comes from, then send the robot back there to collect its info, so that the spammers end up spamming their own servers.

Anyone know of any cons in doing this?
  #2  
Old 15-Jan-2004, 09:56
Allowee
I saw a bot script once. It was small and only visited pages (like Google does); it didn't look for e-mail addresses.

But what I noticed was that it simply sent a commonly used browser string as its User Agent.
That makes those bots really hard to find, because they look just like normal visitors.

The check for whois pages is also not going to work for most bots.
They can get that information from a lot of servers around the world, and if one gets blocked they just use another.

But I must say that the idea of stopping bots by redirecting them to another place is great.
  #3  
Old 16-Jan-2004, 10:32
cs2
Quote:
Originally Posted by BobbyDouglas
Anyone know of any cons in doing this?
The only potential problem I see is in your choice of forcing a redirect. It is amazing how many seemingly idiotic domain names are actually registered. And even if the domain you choose to redirect to is not registered today, there is no guarantee it won't be registered tomorrow.

I suggest changing your rewrite rule to just generate a 403 error:
Code:
RewriteRule .* - [F,L]
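
Put together with the user-agent conditions from the first post, the 403 variant could look like the sketch below. The list is shortened to three of the original agents purely for illustration; the full list would be used the same way.

Code:
```apache
RewriteEngine On
# Same style of user-agent conditions as in the first post (list shortened here).
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP
# "-" means no substitution; [F] answers with 403 Forbidden instead of redirecting.
RewriteRule .* - [F,L]
```

Since nothing is redirected, no third-party domain is involved, whether it is registered or not.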
 
