[clue-talk] Triaging spam

Charles Oriez coriez at oriez.org
Wed Dec 15 21:52:53 MST 2004


At 12:57 PM 12/15/2004, Matt Gushee wrote:

>I've been thinking about how to more effectively deal with the flood of 
>spam I get, and it seems to me that SpamAssassin's yes-or-no judgment is a 
>rather crude mechanism, and a triage approach would be better. I mean:
>
>   Some messages are definitely spam. Send them straight to /dev/null.
>
>   Some messages are definitely not spam. Send them to the Inbox.
>
>   Some messages might be spam. Send them to the maybe-spam folder.

I used to be a fan of SpamAssassin.  It has been missing too much spam 
lately as the spammers figure out how to get around it.  I'm waiting for 
someone to incorporate a spellcheck routine that bounces any mail with more 
than x misspellings.  Spammers all seem to be functionally illiterate.


>This way, my inbox would be (cross fingers) free of spam, and the number 
>of possible spam messages would be kept to a manageable level. Up until 
>now I've been sending messages to a spam mailbox based on 'X-Spam-Status : 
>Yes' ... but it's just getting ridiculous. I *think* it's been a long time 
>since I've had a false positive, but with the number of messages going to 
>my spam box, it's impossible to check thoroughly, and I don't do it often 
>enough--I might as well just be discarding them all.
>
>It seems like between the SA scores and procmail and the message headers, 
>it should be possible to implement this approach. Has anybody done this? 
>For deciding that a message is definitely spam, do you think the scores 
>are enough? If so, where would you set the threshold?
>
>Or would you rather use keywords (e.g. I strongly doubt that a legitimate 
>sender will ever send me e-mail about my pen1s)? Or a combination?

Keywords are a non-starter.  I once ran a spam workshop and sent an 
overview to the program director of the organization sponsoring it.  That 
overview tripped keyword filters and got bounced as spam.  Also, avoid 
challenge/response systems.

Best option, IMO, is to boycott the sources of spam. My method to do that 
is in my sendmail config file. SORBS and Spamhaus do the best job of 
catching things:

FEATURE(`enhdnsbl', `dnsbl.ahbl.org', `"AHBL refused - see http://www.ahbl.org/tools/lookup.php?ip="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `sbl-xbl.spamhaus.org', `"Spamhaus refused - see http://www.spamhaus.org/query/bl?ip="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `l1.spews.dnsbl.sorbs.net', `"SPEWS refused - see http://www.spews.org/ask.cgi?x="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `relays.ordb.org', `"ORDB refused - see http://www.ordb.org/lookup/?host="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `dnsbl.sorbs.net', `"SORBS refused - see http://www.dnsbl.us.sorbs.net/cgi-bin/lookup?IP="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `dnsbl.njabl.org', `"NJABL refused - see http://njabl.org/cgi-bin/lookup.cgi?query="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `abuse.rfc-ignorant.org', `"RFC2142 refused - see http://rfc-ignorant.org/tools/lookup.php?domain="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `dsn.rfc-ignorant.org', `"RFC Ignorant - see http://rfc-ignorant.org/tools/lookup.php?domain="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `postmaster.rfc-ignorant.org', `"RFC2822 refused - see http://rfc-ignorant.org/tools/lookup.php?domain="$&{client_addr}""')dnl
FEATURE(`enhdnsbl', `bl.spamcop.net', `"spamcop blocked - see http://spamcop.net/bl.shtml?"$&{client_addr}')dnl
FEATURE(`enhdnsbl', `argentina.blackholes.us', `Argentina blocked - see http://www.spamhaus.org/sbl/isp_list.lasso?country=Argentina')dnl
FEATURE(`enhdnsbl', `brazil.blackholes.us', `BR blocked - see http://www.spamhaus.org/sbl/isp_list.lasso?country=Brazil')dnl
FEATURE(`enhdnsbl', `taiwan.blackholes.us', `TW blocked - see http://www.spamhaus.org/sbl/isp_list.lasso?country=Taiwan')dnl
FEATURE(`enhdnsbl', `china.blackholes.us', `CN blocked - see http://www.spamhaus.org/sbl/isp_list.lasso?country=China')dnl
FEATURE(`enhdnsbl', `korea.blackholes.us', `KR blocked - see http://www.spamhaus.org/sbl/isp_list.lasso?country=Korea')dnl


coupled with my access.db, which includes the cartooney.org listing.


>Last but not least, anybody have procmail rules they'd like to share?

I'm down to blocking whole countries, unfortunately. This is from an ISP 
where I don't have control of the sendmail config:

#  argentina/brazil: expanded to 200/8 on 08.09.03, then to 200/7 on 12.05.04
:0
* ^Received.*20[01]\.[0-9]*\.[0-9]*\.[0-9]
{
    EXITCODE=77
    LOG = "20[01]/8 - "
    :0
    /dev/null
}

# apnic
:0
* ^Received.*203\.[0-9]*\.[0-9]*\.[0-9]
{
    EXITCODE=77
    LOG = "203/8 - "
    :0
    /dev/null
}
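
For the three-way triage Matt asks about, here is a minimal sketch in the 
same procmail style. It assumes SpamAssassin has already run on the message 
and added its default X-Spam-Level header (one asterisk per point of score) 
along with X-Spam-Status. The 10-star cutoff and the "maybe-spam" mailbox 
name are illustrative assumptions, not values taken from the recipes above:

```procmail
# Definitely spam: ten or more stars in X-Spam-Level, so discard it.
# (The cutoff of 10 is an assumption; tune it against your own mail.)
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*
/dev/null

# Possibly spam: SpamAssassin tagged it, but it scored under the cutoff.
# "maybe-spam" is a hypothetical mbox name, relative to $MAILDIR.
:0:
* ^X-Spam-Status: Yes
maybe-spam

# Anything that matches neither recipe falls through to the default inbox.
```

Order matters here: the high-score recipe has to come first, because a 
message with ten stars also carries "X-Spam-Status: Yes" and would otherwise 
land in the maybe-spam folder. Filing the top tier into a folder for a week 
or two before switching it to /dev/null is a cheap way to check the cutoff.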




-- 
coriez at oriez.org 39° 34' 34.4"N / 105° 00' 06.3"W       AIM handle caoriez
"Si Hoc Legere Scis Nimium Eruditionis Habes" 



