[clue-tech] SPAM trap

Jed S. Baer cluemail at jbaer.cotse.net
Sat Dec 13 17:11:59 MST 2008


On Sat, 13 Dec 2008 10:41:41 -0700
Bob Meetin wrote:

> I have some spam-checking in my forms, also a simple script that
> catches/logs IP addresses of spammers.  I can add the IP addresses to
> deny lists or even make up a list to work with my forms to prevent
> their access to a page or possibly the site itself, but knowing that
> spammers can change their IP easily I would like to use a wildcard
> approach. So say I have:
> 
> 68.63.83.12
> 
> Is there a PHP script that I could include in my forms that will do a
> lookup against a master list and if found, deny the cretin?

Well, if you want to process things server-side in PHP, then there are
all kinds of things to do, depending on how much coding you feel like
doing. I wouldn't do it in the form code. I'd do it as early in the
request as possible, which means having your web server reject requests
from offending address blocks. Your script that catches and logs the IP
addresses of spammers could easily add them to an .htaccess file.
An .htaccess file could get rather long and unwieldy over time, so you'd
probably want to optimize it by using address blocks, and insert a new
line only if the address isn't already covered by an existing entry.
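
A rough sketch of that "only add it if it isn't already covered" check,
in case it helps. The function names and the sample /22 block are made up
for illustration, and it assumes IPv4, well-formed "a.b.c.d/n" entries,
and a 64-bit PHP build. "Deny from" is the Apache 2.2-style directive;
newer Apache versions use a different syntax, so adjust to taste.

```php
<?php
// Hypothetical helper: true if IPv4 address $ip lies inside $cidr,
// where $cidr looks like "68.63.80.0/22".
function ip_in_cidr($ip, $cidr)
{
    list($net, $bits) = explode('/', $cidr);
    $bits = (int) $bits;
    $ipLong  = ip2long($ip);
    $netLong = ip2long($net);
    if ($ipLong === false || $netLong === false) {
        return false; // not a valid dotted-quad address
    }
    // Build a mask with the top $bits bits set (a /0 matches everything).
    $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

// Hypothetical helper: the line to append to .htaccess, or null if one
// of the blocks already on file covers the address.
function deny_line_for($ip, array $blocks)
{
    foreach ($blocks as $cidr) {
        if (ip_in_cidr($ip, $cidr)) {
            return null;
        }
    }
    return "Deny from $ip\n";
}

// Made-up deny list, purely for the example.
$blocks = array('68.63.80.0/22');
$covered = deny_line_for('68.63.83.12', $blocks); // null, already covered
$fresh   = deny_line_for('68.63.84.1', $blocks);  // "Deny from 68.63.84.1\n"
```

Appending the returned line to the file (with locking) is left out on
purpose; the point is just the coverage test.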

> Option B
> would be to manually do a lookup of the IP address (against some online
> DB) and if tracked to, say Bangladash, Africa, remote China or even
> South Florida, then manually add to my deny list. Most of my clients
> are small, do local regional business.

Well, also in PHP, you can do a DNS lookup against an RBL (you query the
reversed IP address under the list's DNS zone), though things like
Spamhaus and SpamCop are for e-mail spam. I don't know of a similar
service that tracks typical culprits for spambotting web-based forms.
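
For what it's worth, an RBL-style lookup in PHP can be sketched roughly
like this. The function names are mine, and "dnsbl.example.org" is a
placeholder zone, not a real list; you'd substitute whichever list you
decide fits. It assumes IPv4 and that checkdnsrr() is available on your
platform.

```php
<?php
// Hypothetical helper: build the DNS name to query for $ip in $zone.
// The octets are reversed and the list's zone is appended.
function dnsbl_query_name($ip, $zone)
{
    return implode('.', array_reverse(explode('.', $ip))) . '.' . $zone;
}

// Hypothetical helper: true if the list answers with an A record,
// which is how these lists conventionally signal "listed".
// This does a live DNS query, so it is not called in this sketch.
function dnsbl_is_listed($ip, $zone)
{
    return checkdnsrr(dnsbl_query_name($ip, $zone), 'A');
}

$name = dnsbl_query_name('68.63.83.12', 'dnsbl.example.org');
// $name is "12.83.63.68.dnsbl.example.org"
```

In a form handler you'd call something like
dnsbl_is_listed($_SERVER['REMOTE_ADDR'], $zone) and bail out early if it
comes back true.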
 
> If I am able to track the culprit to one of these distant places would
> it make sense to add the IP root ( 68.63.83 ) to my db and not the
> complete IP address?

If you're maintaining your own DB, then you can use whatever pattern
matching code you want, and write your code for partial IP matches.

If I were doing it, I'd consider using allocated address blocks and CIDR
notation. I don't know that there's really a notion of an "IP root",
since address blocks don't have to be divided on the dots (though they
can be).
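
To make the CIDR point concrete, here's a small sketch that expands a
block into its first and last address. The function name is made up, the
/22 below is just an illustrative block (not a looked-up allocation), and
it assumes IPv4 on a 64-bit PHP build.

```php
<?php
// Hypothetical helper: return array(first, last) for an IPv4 CIDR block.
function cidr_bounds($cidr)
{
    list($net, $bits) = explode('/', $cidr);
    $bits = (int) $bits;
    // Top $bits bits set; a /0 covers the whole address space.
    $mask  = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;
    $first = ip2long($net) & $mask;
    // Set all the host bits to get the last address in the block.
    $last  = $first | (~$mask & 0xFFFFFFFF);
    return array(long2ip($first), long2ip($last));
}

$dotAligned = cidr_bounds('68.63.83.0/24'); // 68.63.83.0 - 68.63.83.255
$notAligned = cidr_bounds('68.63.80.0/22'); // 68.63.80.0 - 68.63.83.255
```

The /24 happens to line up with "68.63.83", but the /22 spans four values
of the third octet, which a match on the first three numbers can't
express.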

You can also look up the IP address and block simply based on country of
origin. I'm not sure of the best way to do this; maybe by querying ARIN.
You'd have to figure out how to form your query to get back the results
you want, and parse the data out of it, unless ARIN (or a similar
service) has a nice XML-RPC interface where you can just XPath (or
something like that) the value directly from the returned data. Worst
case, I think, would be some ugly PHP code such as:

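<?php

	// Shell out to the system whois client (assumes one is installed).
	// escapeshellarg() matters once the address comes from a request.
	$ip  = '68.63.83.12';
	$foo = shell_exec('whois ' . escapeshellarg($ip));
	// then parse $foo and do whatever

?>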

From there, even something as simple as a regex match on typical strings
such as "APNIC" might do, if you just want to ignore the whole
Asia/Pacific area.
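
Something along these lines is what I have in mind for the regex match.
The function name is made up, and the sample text is invented for the
example rather than real whois output. A bare string match like this is
crude and could hit the registry name in unrelated boilerplate, so treat
it as a starting point.

```php
<?php
// Hypothetical helper: return the first registry name from $registries
// that appears as a whole word in $whoisText, or null if none does.
function whois_find_registry($whoisText, array $registries)
{
    foreach ($registries as $name) {
        $pattern = '/\b' . preg_quote($name, '/') . '\b/i';
        if (preg_match($pattern, $whoisText)) {
            return $name;
        }
    }
    return null;
}

// Invented sample text, NOT real whois output.
$sample = "OrgName: Example Network\nComment: allocated by APNIC\n";

$hit  = whois_find_registry($sample, array('APNIC'));          // "APNIC"
$miss = whois_find_registry("OrgName: Example\n", array('APNIC')); // null
```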

Or use the cURL library to send HTTP requests to ARIN (or wherever) to
get info on the IP address.
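
A minimal sketch of the cURL route, with caveats. The URL pattern is my
assumption about how a REST-style whois service at ARIN would be
addressed, so check their documentation before relying on it. The
function names are made up, and the PHP cURL extension has to be loaded.

```php
<?php
// Hypothetical helper: build the lookup URL for $ip.
// ASSUMPTION: this URL layout is a guess, not a verified endpoint.
function arin_lookup_url($ip)
{
    return 'https://whois.arin.net/rest/ip/' . rawurlencode($ip) . '.txt';
}

// Hypothetical helper: fetch $url and return the body, or null on
// failure. This makes a live request, so it is not called here.
function http_get_body($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return, don't print
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // don't hang the form
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

$url = arin_lookup_url('68.63.83.12');
```

You'd then pass the result of http_get_body($url) to whatever parsing you
settle on, and probably cache the answers so you aren't making a remote
request on every form submission.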

Really, the options here are pretty broad.

It's hard to recommend anything specific, because your web forms could be
anything. Do you have a BBS, blog, web commerce ... ???

jed

