Created: November 12th 2024
Last updated: November 12th 2024
Categories: Cyber Security,  Linux
Author: Marcus Fleuti

SpamAssassin: Detect Spam by TO/CC Address Usage in Email Body with False Positive Prevention

Introduction

Spam emails often include recipient addresses directly in their body text - a technique commonly used in phishing attempts and mass spam campaigns. This article shows you how to implement a SpamAssassin rule that detects such patterns and scores emails accordingly. We'll also cover an intelligent solution to prevent false positives when dealing with legitimate email replies.

Understanding the Spam Pattern

Spammers frequently include recipient email addresses in their message body for various reasons:

  • Creating a false sense of personalization
  • Attempting to make the email appear more legitimate
  • Poor mail merge implementations
  • Phishing attempts that reference the recipient's email address

Plugin Implementation

Copy the following plugin code, then proceed as described in the Installation and Configuration section below:

package Mail::SpamAssassin::Plugin::CheckToCCAddress;
use strict;
use warnings;
use Mail::SpamAssassin::Plugin;
use vars qw(@ISA);
@ISA = qw(Mail::SpamAssassin::Plugin);

sub new {
    my ($class, $mailsa) = @_;
    $class = ref($class) || $class;
    my $self = $class->SUPER::new($mailsa);
    bless ($self, $class);
    $self->register_eval_rule('check_address_in_body');
    return $self;
}

sub check_address_in_body {
    my ($self, $pms, @header_names) = @_;
    
    my $body_ref = $pms->get_decoded_body_text_array();
    return 0 unless $body_ref;
    
    my $body = join("\n", @$body_ref);
    
    # First check if FROM address is in body (indicating a reply)
    my @from_addresses = $pms->get("From:addr");
    foreach my $from_addr (@from_addresses) {
        next unless $from_addr;
        $from_addr =~ s/^\s+|\s+$//g;
        if ($body =~ /\b\Q$from_addr\E\b/i) {
            # Found FROM address in body, likely a reply, skip check
            return 0;
        }
    }
    
    # If we get here, proceed with normal TO/CC check
    foreach my $header (@header_names) {
        my @addresses = $pms->get("${header}:addr");
        foreach my $addr (@addresses) {
            next unless $addr;
            $addr =~ s/^\s+|\s+$//g;
            if ($body =~ /\b\Q$addr\E\b/i) {
                return 1;
            }
        }
    }
    
    return 0;
}

# A Perl module should end with a true value so that loading it succeeds
1;

Installation and Configuration

Create the plugin file

Save the plugin code above as CheckToCCAddress.pm in your SpamAssassin plugin directory, typically /usr/share/perl5/Mail/SpamAssassin/Plugin/ or /etc/spamassassin. The loadplugin line in the next section uses a relative file name, which SpamAssassin resolves relative to the configuration file, so it expects the file in /etc/spamassassin.

SpamAssassin local.cf Configuration

Add the following lines to your /etc/spamassassin/local.cf file. You can use your preferred text editor like vim or nano:

# Load the CheckToCCAddress plugin
loadplugin Mail::SpamAssassin::Plugin::CheckToCCAddress CheckToCCAddress.pm

# Define the rule that checks for recipient addresses in body
header   TO_CC_IN_BODY  eval:check_address_in_body('To','Cc')
describe TO_CC_IN_BODY  Recipient address found in message body
score    TO_CC_IN_BODY  2.0

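# Note: rules that hit are listed in the report by default; no tflags are needed.
# The 'learn' tflag is meant for rules that require training (such as Bayes)
# and should not be set for this rule.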

Testing the new plugin/rule

After adding these lines, verify the configuration:

  1. Check for syntax errors:
    spamassassin --lint
  2. If there are no errors, also check that the plugin has been loaded successfully:
    spamassassin -D --lint 2>&1 |grep CheckToCCAddress

    You should see a line like this:

    Nov 12 11:50:45.047 [688419] dbg: plugin: loading Mail::SpamAssassin::Plugin::CheckToCCAddress from /etc/spamassassin/CheckToCCAddress.pm
  3. If you're running SpamAssassin as a service, restart or reload it:
    sudo systemctl reload spamassassin

    or on older systems:

    sudo service spamassassin restart
  4. Test the rule with a sample email:
    spamassassin -D --test-mode < test_email.txt | grep TO_CC_IN_BODY
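
For step 4, a test message can be generated with a short Perl script. This is only a minimal sketch: the addresses, subject, date, and Message-ID are made-up placeholders (example.com and example.org are reserved documentation domains), and only the file name test_email.txt is taken from the command above. The To address appears in the body and the From address does not, so the rule would be expected to hit.

```perl
use strict;
use warnings;

# Hypothetical addresses, chosen for illustration only
my $from = 'sender@example.org';
my $to   = 'recipient@example.com';

# Minimal message: headers, an empty line, then the body.
# The body repeats the To address but not the From address.
my $message = join("\n",
    "From: $from",
    "To: $to",
    "Subject: Account notice",
    "Date: Tue, 12 Nov 2024 12:00:00 +0000",
    "Message-ID: <test-1\@example.org>",
    "",
    "Dear $to, please confirm your account.",
    "",
);

open(my $fh, '>', 'test_email.txt') or die "Cannot write test_email.txt: $!";
print {$fh} $message;
close($fh) or die "Cannot close test_email.txt: $!";
```

To exercise the false positive prevention as well, add the From address to the body (for example in a signature line) and run the test again; the rule should then no longer hit.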

You can adjust the score value (2.0) based on your needs. Higher scores will be more aggressive in marking emails as spam when recipient addresses are found in the body. Some guidelines for scoring:

  • 1.0 - 2.0: Conservative scoring, good for initial testing
  • 2.0 - 3.0: Moderate scoring, recommended for most environments
  • 3.0+: Aggressive scoring, use only if you're seeing high accuracy with this rule

How we prevent False Positives

A common issue with TO/CC address detection is false positives from legitimate email replies that contain signatures. Our plugin implements a FROM check to handle this:

  1. Before checking for TO/CC addresses, the plugin checks whether the sender's (FROM) address also appears in the body. This can let some spam slip through, but in our experience most spam does not contain the FROM address in the body, because it is sent in bulk with constantly changing FROM addresses.
  2. If the FROM address is found, it likely indicates a legitimate reply containing a signature
  3. In such cases, the check is skipped to prevent false positives
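
The decision order described in these steps can be tried outside SpamAssassin. The following is a minimal sketch, not the plugin itself: address_in_body is a hypothetical helper that takes plain strings instead of a SpamAssassin message object, and the sample addresses and bodies are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper mirroring the plugin's decision order.
# Returns 1 only if a recipient address is in the body and the sender's is not.
sub address_in_body {
    my ($body, $from, @recipients) = @_;

    # Step 1: sender address found in body => treat as a reply, skip the check
    if (defined($from) && length($from) && $body =~ /\b\Q$from\E\b/i) {
        return 0;
    }

    # Step 2: otherwise flag the message if any recipient address appears
    for my $addr (@recipients) {
        next unless defined($addr) && length($addr);
        return 1 if $body =~ /\b\Q$addr\E\b/i;
    }
    return 0;
}

# Invented sample bodies: a spam-like message and a reply with a signature
my $spam  = "Dear bob\@example.com, your mailbox is full.";
my $reply = "Thanks!\n-- \nAlice <alice\@example.org>\n> To: bob\@example.com";

print address_in_body($spam,  'alice@example.org', 'bob@example.com'), "\n";  # prints 1
print address_in_body($reply, 'alice@example.org', 'bob@example.com'), "\n";  # prints 0
```

The second message contains the recipient address too, but because the sender's address appears in the signature, the check is skipped and the result is 0.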

Solution Comparison

Feature                     This Plugin   Basic Regex Match   Quote Detection
Detects TO/CC in Body       Yes           Yes                 Yes
False Positive Prevention   Yes           No                  Limited
Reply Detection             Smart         None                Basic
Performance Impact          Low           Very Low            Medium

Testing and Fine-Tuning

After implementation, monitor your mail logs for the TO_CC_IN_BODY rule hits. You may need to adjust the score based on your environment:

  • A score of 2.0 is a moderate starting point
  • Increase the score if you see a high correlation with actual spam
  • Decrease if you notice any remaining false positives
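
To judge how well the rule correlates with actual spam, the hits can be counted per verdict. The sketch below assumes spamd-style log lines of the form "result: Y SCORE - RULE,RULE,..." (Y for spam, a dot for ham); if your mail flows through amavis, a milter, or another wrapper, the log format will differ and the pattern needs adjusting. count_hits is a hypothetical helper and the sample lines are made up.

```perl
use strict;
use warnings;

# Count hits of one rule, split by the overall spam/ham verdict.
# Assumes spamd-style "result: Y|. SCORE - RULE,RULE,..." log lines.
sub count_hits {
    my ($rule, @lines) = @_;
    my %count = (spam => 0, ham => 0);
    for my $line (@lines) {
        # Capture the verdict flag and the comma-separated rule list
        next unless $line =~ /result: (\S) -?[\d.]+ - (\S*)/;
        my ($verdict, $rules) = ($1, $2);
        next unless grep { $_ eq $rule } split /,/, $rules;
        $count{ $verdict eq 'Y' ? 'spam' : 'ham' }++;
    }
    return %count;
}

# Made-up sample lines in the assumed format
my @sample = (
    'spamd: result: Y 9 - HTML_MESSAGE,TO_CC_IN_BODY scantime=0.5',
    'spamd: result: . 1 - TO_CC_IN_BODY scantime=0.3',
    'spamd: result: . 0 - HTML_MESSAGE scantime=0.2',
);
my %hits = count_hits('TO_CC_IN_BODY', @sample);
print "spam: $hits{spam}, ham: $hits{ham}\n";  # prints "spam: 1, ham: 1"
```

Many hits on messages that end up classified as ham suggest lowering the score; hits almost exclusively on spam suggest there is room to raise it.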

Conclusion

This SpamAssassin rule provides an effective way to detect spam emails that use recipient addresses in their body while intelligently avoiding false positives from legitimate replies. The implementation is lightweight, efficient, and can be easily customized to suit your specific needs.

Understanding the Perl Module

The plugin is written in Perl (`.pm` stands for Perl Module) since SpamAssassin itself is a Perl-based application. Key Perl-specific elements in our code include:

  • package declaration: package Mail::SpamAssassin::Plugin::CheckToCCAddress; - defines the Perl namespace for our module
  • use strict; use warnings; - Perl pragmas that enforce better coding practices and provide helpful warnings
  • @ISA - Perl's inheritance mechanism (stands for "IS A"), used here to inherit from the base SpamAssassin plugin class
  • bless - Perl's object-orientation mechanism, used in the constructor
  • Regular expressions - Perl's powerful regex capabilities, used here for pattern matching with the =~ operator
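
The plugin wraps each address in \Q...\E before matching. A small sketch shows why this matters: without quoting, every dot in an address acts as a regex wildcard. The address and body text below are invented for illustration.

```perl
use strict;
use warnings;

my $addr = 'j.doe@example.com';
my $body = 'Contact jxdoe@exampleXcom for details.';

# Without \Q...\E each "." in the address matches any character,
# so a different string in the body is wrongly treated as a match.
my $unquoted = ($body =~ /$addr/i) ? 1 : 0;

# With \Q...\E the interpolated address is matched literally,
# and \b anchors the match at word boundaries as in the plugin.
my $quoted = ($body =~ /\b\Q$addr\E\b/i) ? 1 : 0;

print "unquoted: $unquoted, quoted: $quoted\n";  # prints "unquoted: 1, quoted: 0"
```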

If you're new to Perl, don't worry - the code is relatively straightforward. The most important parts to understand are:

  • The plugin inherits from SpamAssassin's base plugin class
  • It registers an eval function (check_address_in_body) that SpamAssassin can call
  • This function receives email header names and checks if those addresses appear in the body

A basic understanding of Perl is helpful but not required for using this plugin, though it would be necessary if you want to modify or extend its functionality.