DSPAM with VExim and folder-based training using Courier IMAP

Spam is getting worse, so I’m trying out DSPAM. I’ve added it to my VExim setup and, although it’s early days, things are looking promising.

I run various accounts with IMAP/POP3. For the POP3 users, there are training aliases to forward incorrectly-classified mail to. For IMAP users, it would be much more convenient to have DSPAM learn from a folder instead – drag and drop is easier than click, click, type, click.

I found a bash script, written for Postfix, that does nearly the right thing. Here’s the tweaked version for VExim/Courier.

[snip bash]
#!/bin/sh
#
#script to check for spam files and feed them to dspam
#
USERFILE=/tmp/dspam.users
DSPAM=/usr/bin/dspam
MYSQL=/usr/bin/mysql
SPAM_USER_FOLDER=/Maildir/.Junk
VIRTUAL_BASE=/var/spool/mail/
DBUSER=vexim
DBPASS=INSERT_YOUR_PASSWORD_HERE
DB=vexim
DELETE=YES

echo `date` Begin AutoSpam processing

# this query grabs all the maildirs for dspam users
$MYSQL -u $DBUSER -p$DBPASS -e \
"select users.localpart, domains.domain from users \
left join domains on domains.domain_id=users.domain_id \
where users.type='local' and users.enabled = 1" \
--skip-column-names $DB > $USERFILE

while read LOCALPART DOMAIN
do
echo `date` Processing $LOCALPART in $VIRTUAL_BASE$DOMAIN

# check if the user has a .Junk folder
if [ -d $VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER ]; then
# check both new and cur directories for spam
cd $VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER/new || continue
for j in *
do
# check if the file exists
if [ -s $j ]; then
# check if file was already identified as SPAM
grep "X-DSPAM-Result: Spam" $j 1>/dev/null
RESULT=$?
if [ $RESULT = 0 ]; then
echo "already processed as spam!"
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
else
echo `date` Processing `pwd`/$j as spam
$DSPAM --user $LOCALPART@$DOMAIN --class=spam --source=error < $j
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
fi
fi
done
cd $VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER/cur || continue
for j in *
do
# check if the file exists
if [ -s $j ]; then
# check if file was already identified as SPAM
grep "X-DSPAM-Result: Spam" $j 1>/dev/null
RESULT=$?
if [ $RESULT = 0 ]; then
echo "already processed as spam!"
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
else
echo `date` Processing `pwd`/$j as spam
$DSPAM --user $LOCALPART@$DOMAIN --class=spam --source=error < $j
# if wanted, delete the mail.
if [ $DELETE = "YES" ]; then
rm -f $j
fi
fi
fi
done
fi
done < $USERFILE

# clean up action..
rm -f $USERFILE

echo `date` End AutoSpam processing
echo
[/snip]
Just get your users to create a folder called “Junk” and drop false negatives in there.
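
The new and cur passes in the script are identical apart from the directory, so the selection logic could be pulled into one function. Here is a minimal sketch of that idea. The function name `untrained_spam` is my own invention and not part of the script above. It only lists the messages that would be fed to DSPAM (non-empty files without the `X-DSPAM-Result: Spam` header), so it can also serve as a dry run before trusting DELETE=YES.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script: print every non-empty
# message in a Maildir folder's new/ and cur/ directories that does not
# already carry the "X-DSPAM-Result: Spam" header.
untrained_spam() {
    folder=$1
    for sub in new cur; do
        for f in "$folder/$sub"/*; do
            # -s is false for empty files and for the unexpanded glob
            # left behind by an empty or missing directory
            [ -s "$f" ] || continue
            grep -q "X-DSPAM-Result: Spam" "$f" || printf '%s\n' "$f"
        done
    done
}
```

Calling it as `untrained_spam /var/spool/mail/example.com/alice/Maildir/.Junk` (the domain and user are placeholders) prints one path per line, which could then be piped into a loop that runs dspam on each file.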

Word of warning: If you leave DELETE=YES, the script will purge the folder when it’s done. Thunderbird has a setting to move e-mails it thinks are spam to the “Junk” folder automatically. This might therefore eat e-mails without users seeing them. Change “Junk” in the script to “UncaughtSpam” or something similarly unlikely to clash with users’ existing folders if you’re worried about this.
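
For the rename, only the SPAM_USER_FOLDER line should need to change, because Courier stores each IMAP folder as a dot-prefixed directory inside the user's Maildir. A small sketch, assuming the folder is called “UncaughtSpam” and using a placeholder user to show the path the script would end up scanning:

```shell
#!/bin/sh
# Assumed replacement for the SPAM_USER_FOLDER line near the top of the script.
SPAM_USER_FOLDER=/Maildir/.UncaughtSpam
VIRTUAL_BASE=/var/spool/mail/

# Placeholder domain and user, for illustration only.
DOMAIN=example.com
LOCALPART=alice

# Same path construction as the script's "-d" check.
echo "$VIRTUAL_BASE$DOMAIN/$LOCALPART$SPAM_USER_FOLDER"
```

Users then create “UncaughtSpam” instead of “Junk”, and Thunderbird’s automatic junk filing no longer lands in the folder the script purges.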
