DSPAM with VExim and folder-based training using Courier IMAP

Posted December 28, 2006 by Alastair

Spam is getting worse, so I’m trying out DSPAM. I’ve added it into my vexim set-up and although it’s early days, things are looking promising.

I run various accounts with IMAP/POP3. For the POP3 users, there are training aliases to forward incorrectly-classified mail to. For IMAP users, it would be much more convenient to have DSPAM learn from a folder instead – drag and drop is easier than click, click, type, click.

I found a bash script which does nearly the right thing (using Postfix). Here’s the tweaked version for VExim/Courier.

[snip bash]
#!/bin/sh
#
#script to check for spam files and feed them to dspam
#
USERFILE=/tmp/dspam.users
DSPAM=/usr/bin/dspam
MYSQL=/usr/bin/mysql
SPAM_USER_FOLDER=/Maildir/.Junk
VIRTUAL_BASE=/var/spool/mail/
DBUSER=vexim
DBPASS=INSERT_YOUR_PASSWORD_HERE
DB=vexim
DELETE=YES

echo `date` Begin AutoSpam processing

# this query grabs all the maildirs for dspam users
$MYSQL -u $DBUSER -p$DBPASS -e \
"select users.localpart, domains.domain from users \
left join domains on domains.domain_id=users.domain_id \
where users.type='local' and users.enabled = 1" \
--skip-column-names $DB > $USERFILE

while read LOCALPART DOMAIN
do
echo `date` Processing $LOCALPART in $VIRTUAL_BASE$DOMAIN

# check if the user has a .Junk folder
if [ -d $VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER ]; then
# check both new and cur directories for spam
cd $VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER/new
for j in *
do
# check if the file exists
if [ -s $j ]; then
# check if file was already identified as SPAM
grep "X-DSPAM-Result: Spam" $j 1>/dev/null
RESULT=$?
if [ $RESULT = 0 ]; then
echo "already processed as spam!"
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
else
echo `date` Processing `pwd`/$j as spam
$DSPAM --user $LOCALPART@$DOMAIN --class=spam --source=error < $j
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
fi
fi
done
cd $VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER/cur
for j in *
do
# check if the file exists
if [ -s $j ]; then
# check if file was already identified as SPAM
grep "X-DSPAM-Result: Spam" $j 1>/dev/null
RESULT=$?
if [ $RESULT = 0 ]; then
echo "already processed as spam!"
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
else
echo `date` Processing `pwd`/$j as spam
$DSPAM --user $LOCALPART@$DOMAIN --class=spam --source=error < $j
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
fi
fi
done
fi
done < $USERFILE

# clean up action..
rm -f $USERFILE

echo `date` End AutoSpam processing
echo
[/snip]
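The new and cur loops in the script are identical apart from the directory, so they could be folded into one function. What follows is a minimal sketch rather than a tested drop-in: train_dir is a hypothetical helper name, and DSPAM and DELETE are assumed to mean the same as in the script above.

```shell
#!/bin/sh
# Sketch only: train_dir is a hypothetical helper, not part of DSPAM or VExim.
# DSPAM and DELETE are assumed to mirror the settings in the script above.
DSPAM=${DSPAM:-/usr/bin/dspam}
DELETE=${DELETE:-YES}

train_dir() {
    # $1 = directory holding messages (a maildir new/ or cur/)
    # $2 = DSPAM user, i.e. localpart@domain
    dir=$1
    user=$2
    [ -d "$dir" ] || return 0
    for msg in "$dir"/*
    do
        # skip empty files and the unexpanded glob of an empty directory
        [ -s "$msg" ] || continue
        if grep -q "X-DSPAM-Result: Spam" "$msg"; then
            echo "$msg: already processed as spam"
        else
            echo "$msg: training as spam"
            "$DSPAM" --user "$user" --class=spam --source=error < "$msg"
        fi
        # if wanted, delete the mail in either case, as the original does
        if [ "$DELETE" = "YES" ]; then
            rm -f "$msg"
        fi
    done
}

# Usage inside the per-user loop (variables as in the script above):
# for sub in new cur; do
#     train_dir "$VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER/$sub" "$LOCALPART@$DOMAIN"
# done
```

Quoting the file names also keeps the loop safe if a message file ever has an unusual name.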
Just get your users to create a folder called “Junk” and drop false negatives in there.

Word of warning: if you leave DELETE=YES, the script purges the folder when it’s done. Thunderbird has a setting to move e-mails it thinks are spam to the “Junk” folder automatically, so the script might train on and delete e-mails without users ever seeing them. If you’re worried about this, change “.Junk” in SPAM_USER_FOLDER to “.UncaughtSpam” or something similarly unlikely to clash with users’ existing folders.
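Before enabling DELETE=YES it may help to see what the script would do to a given folder. The sketch below only reads the folder and prints one line per message. preview_junk is a hypothetical helper, not part of DSPAM or VExim, and the path in the usage line is just an example.

```shell
#!/bin/sh
# Sketch only: preview_junk is a hypothetical, read-only helper.
# It reports what the training script would do, without touching any mail.
preview_junk() {
    # $1 = path to a Junk maildir folder, e.g. .../Maildir/.Junk
    for msg in "$1"/new/* "$1"/cur/*
    do
        # skip empty files and unexpanded globs
        [ -s "$msg" ] || continue
        if grep -q "X-DSPAM-Result: Spam" "$msg"; then
            # already tagged as spam by DSPAM: would only be deleted
            echo "delete-only $msg"
        else
            # not tagged as spam: would be fed to DSPAM, then deleted
            echo "train+delete $msg"
        fi
    done
}

# Usage (example path, adjust to your layout):
# preview_junk /var/spool/mail/example.com/alice/Maildir/.Junk
```

A long list of train+delete lines that the user did not put there by hand would suggest that a client-side filter is filling the folder.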

Post Details

  • Post Title: DSPAM with VExim and folder-based training using Courier IMAP
  • Author: Alastair
  • Filed As: E-mail

One Opinion has been expressed on “DSPAM with VExim and folder-based training using Courier IMAP”.

  1. João Mesquita commented:

    Perfect! Thanks a lot, spam classification is ALWAYS a pain. :)
