SpamAssassin Custom Rulesets
I’m starting this post partly as a placeholder for some information I’ve come across. I’ve been tinkering with my first custom SpamAssassin rule. I’ve tried the SARE rulesets, and they seem to be missing one specific class of junk mail in my setup. After verifying that the rulesets were actually being used, I set about creating my own rule to deal with the offending messages.
I found that it is amazingly simple at first blush to get a new rule going.
vi myrule.cf
header MY_LOCAL_SUBJECT_TEST Subject =~ /obviousspamword/i
score MY_LOCAL_SUBJECT_TEST 5.0
The above looks in the headers of the message, and if the Subject contains obviousspamword (the trailing i after the closing slash makes the match case-insensitive), it adds 5 to the score. One warning here: the way I’ve set this up, if “obviousspamword” is part of a “good” email word, then I’m in trouble. If I want to make sure that word breaks are observed, so that I match only obviousspamword and not obviousspamwordoftheday, then I need Subject =~ /\bobviousspamword\b/i
It’s possible to do a body search the same way
body MY_LOCAL_BODY_TEST /obviousspamword/i
Or, to draw from the SA rules howto:
In regular expressions a \b can be used to indicate where a word-break (anything that isn’t an alphanumeric character or underscore) must exist for a match. Our rule above can be made to not match “testing” or “attest” like so:
body LOCAL_DEMONSTRATION_RULE /\btest\b/
The rule can also be made case-insensitive by adding an i to the end, like this:
body LOCAL_DEMONSTRATION_RULE /\btest\b/i
score LOCAL_DEMONSTRATION_RULE 0.1
Now the rule will match any combination of upper or lower case that spells “test” surrounded by word breaks of some form.
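To see the difference between a plain substring match and a \b-anchored match without touching a live mail system, the patterns can be tried in a small script. This is only a sketch: it uses Python's re module as a stand-in for the Perl regexes SpamAssassin uses (the two should agree for patterns this simple), and the sample subjects are made up for illustration.

```python
import re

# Python stand-ins for the Perl-style patterns in the rules above.
# re.IGNORECASE plays the role of the trailing "i" modifier.
loose = re.compile(r"obviousspamword", re.IGNORECASE)       # /obviousspamword/i
strict = re.compile(r"\bobviousspamword\b", re.IGNORECASE)  # /\bobviousspamword\b/i

# Hypothetical Subject lines, invented for this example.
subjects = [
    "Buy ObviousSpamWord now",
    "obviousspamwordoftheday newsletter",
    "Meeting notes for Tuesday",
]

# Map each subject to (loose match?, strict match?).
results = {s: (bool(loose.search(s)), bool(strict.search(s))) for s in subjects}

for subject, (loose_hit, strict_hit) in results.items():
    print(f"{subject!r}: loose={loose_hit} strict={strict_hit}")
```

The second subject is the interesting one: the loose pattern hits it, while the \b version does not, because the letter following the word leaves no word break there.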
That’s all well and good, but what if we need fancier matching of terms? Usually just one word isn’t enough. The matches above use Perl regex syntax, and more detailed examples of regexes can be found at the perldoc site.
Other examples can be found at www.exit0.us (custom body tests) and in the rules basics at the exit0.us wiki.
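As one illustration of fancier matching, alternation lets a single pattern cover several phrasings at once. The sketch below again uses Python's re module as a stand-in for Perl regex syntax; the spam phrases and the matches helper are hypothetical, chosen only to show the technique.

```python
import re

# Hypothetical spam phrases. (?:...) groups alternatives without capturing,
# and \s+ allows any run of whitespace between the two words.
phrase = re.compile(r"\b(?:cheap|discount)\s+(?:pills|meds)\b", re.IGNORECASE)

def matches(text: str) -> bool:
    """Return True if the text contains one of the two-word phrases."""
    return phrase.search(text) is not None

for sample in ["Get CHEAP   pills today", "discount meds", "cheap furniture"]:
    print(sample, "->", matches(sample))
```

The same pattern text between slashes, with a trailing i, should be usable in a body or header rule, though I would still check it with the lint step before trusting it.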
The first site you should look at IF you want to tweak spamassassin with new rulesets is rulesemporium.com. There are many good and useful sets there.
Here are a few other suggestions that I’ve come across for building a custom ruleset. Use lots of little rules that each add a small number of points instead of one big rule. Think of ALL the possible ways something MIGHT match (am I killing good mail along with the bad?). Make some rules that give a negative value to the spam score. (If you’re a furniture shop, then messages with bed, couch, wood, etc. would lower the spam score.) Use an online corpus of known spam to test against. (Don’t try to feed the messages as new through a live mail system. There are other tools to test with…)
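The idea of many small rules plus a few negative ones can be sketched as a toy scorer. This is not how SpamAssassin is implemented; it only mimics the additive scoring idea. The rule names, patterns, scores and sample messages are all hypothetical, and the threshold of 5.0 is assumed here to mirror SpamAssassin's usual default required score.

```python
import re

# Hypothetical rules: (name, pattern, score). Small positive scores add up,
# and the negative score rewards vocabulary that is normal for a furniture shop.
RULES = [
    ("LOCAL_SPAMWORD", re.compile(r"\bobviousspamword\b", re.I), 2.0),
    ("LOCAL_URGENT", re.compile(r"\bact now\b", re.I), 1.5),
    ("LOCAL_LIMITED", re.compile(r"\blimited offer\b", re.I), 2.0),
    ("LOCAL_FURNITURE", re.compile(r"\b(?:bed|couch|wood)\b", re.I), -1.5),
]

THRESHOLD = 5.0  # assumed threshold for calling a message spam

def score(text: str) -> float:
    """Sum the scores of every rule whose pattern matches the text."""
    return sum(points for _, pattern, points in RULES if pattern.search(text))

def is_spam(text: str) -> bool:
    return score(text) >= THRESHOLD

# A tiny made-up corpus; a real test would run against saved known spam and known good mail.
spam_msg = "Act now! obviousspamword limited offer"
ham_msg = "Act now to order your oak bed, limited offer on our wood couch"

print(score(spam_msg), is_spam(spam_msg))
print(score(ham_msg), is_spam(ham_msg))
```

Here no single rule is enough to cross the threshold, and the furniture words pull the legitimate sales message further away from it, which is the behaviour the suggestions above are aiming for.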
When you’ve made your rule, type spamassassin --lint -D to check that the rule’s syntax is correct.
Finally, be conservative in your testing of custom rules, and don’t be too ambitious. If you can get rid of one class of junk mail at a time, or even just raise its score, that should make for an improvement.