Managing a mail server can be challenging, especially when you need to analyze email patterns or track specific communications. While Postfix provides robust logging capabilities, extracting meaningful information from these logs often requires complex parsing and processing. In this guide, we'll explore a powerful script that combines Bash and Python to extract and format mail log data, including email subjects, into easily analyzable formats.
Email log analysis is crucial for various scenarios. System administrators and IT professionals frequently need to analyze mail logs for troubleshooting delivery issues, monitoring email patterns and volume, auditing email communications, investigating security incidents, and generating usage reports.
The default Postfix configuration doesn't include subject lines in logs, which can make these tasks more challenging. Our solution addresses this limitation while providing flexible export options that make data analysis straightforward and efficient.
Before implementing the script, it's important to ensure your system meets all necessary requirements. Let's walk through each component needed for successful implementation.
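Your server needs to have these core components installed and properly configured:
- Postfix with PCRE map support (required by the header check configured below)
- Bash 4 or later (the script uses the ${var,,} and ${var^^} case conversions)
- Python 3 (used to decode MIME-encoded subjects)
- GNU grep (the script relies on the -P option)
- zcat (used to read compressed rotated logs)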
One of the advantages of our script is that it relies entirely on Python's standard library modules, requiring no additional package installation. The script imports these built-in modules:
- sys
- email.header
- quopri
- base64
- re
- codecs
To enable subject line logging in Postfix, you'll need to make some configuration changes. Follow these steps carefully:
1. First, create or edit the header checks configuration file:
sudo mkdir -p /etc/postfix/maps/
sudo nano /etc/postfix/maps/header_checks.pcre
2. Add this content to enable subject logging:
/^Subject:/ WARN
3. Configure Postfix to use the header checks by editing the main configuration:
sudo nano /etc/postfix/main.cf
4. Add or modify this line in main.cf:
header_checks = pcre:/etc/postfix/maps/header_checks.pcre
5. Apply the changes by reloading Postfix:
sudo postfix reload
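Once Postfix has reloaded, the cleanup service logs a warning line for each Subject header it processes. The Python sketch below shows how the fields the export script needs could be pulled out of such a line. The sample line is made up for illustration (queue ID, host names, and addresses are hypothetical, and real lines vary by setup), and the regular expressions mirror the grep patterns used in the script rather than any official log grammar.

```python
import re

# Hypothetical sample of a header_checks WARN line; real lines vary by setup.
SAMPLE = (
    "Jan 12 08:15:42 mail postfix/cleanup[2211]: 4F1A22C0B1: warning: "
    "header Subject: Quarterly report from mx.example.org[192.0.2.10]; "
    "from=<alice@example.org> to=<bob@example.com> proto=ESMTP "
    "helo=<mx.example.org>"
)

# Subject text runs up to the " from host[ip]" suffix or the end of the line.
SUBJECT_RE = re.compile(r"Subject: (.*?)(?= from [^ ]+\[|$)")
FROM_RE = re.compile(r"from=<([^>]+)")
TO_RE = re.compile(r"to=<([^>]+)")


def parse_line(line):
    """Return the timestamp, subject, sender and recipient of one log line."""
    subject = SUBJECT_RE.search(line)
    sender = FROM_RE.search(line)
    recipient = TO_RE.search(line)
    return {
        "datetime": line[:15],  # classic syslog timestamps are 15 characters
        "subject": subject.group(1) if subject else "",
        "from": sender.group(1) if sender else "",
        "to": recipient.group(1) if recipient else "",
    }


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

If your syslog daemon writes ISO 8601 timestamps instead of the classic format, the fixed 15-character slice would need adjusting.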
Below is the complete implementation of our Postfix mail log export script. The script combines Bash for log processing and Python for robust subject line decoding:
#!/bin/bash
# ===========================================
# Postfix Mail Log Export Script
# ===========================================
# This script processes Postfix mail logs and exports email information
# into either markdown tables or CSV format. It supports processing of
# both current and historical (rotated) log files, with options to
# handle compressed archives.
# ===========================================
# Configuration Variables
# ===========================================
# Base path for mail log files. Rotated logs will be searched as:
# LOG_PATH.1, LOG_PATH.2.gz, etc.
LOG_PATH="/var/log/mail.log"
# Character used to wrap subject text in markdown output.
# This helps prevent conflicts with markdown table separators
# and makes the output more reliable when copy/pasting.
SUBJECT_SEP=""
# When true, subjects starting with '=' will be prefixed with
# a single quote. This prevents Excel/OpenOffice from interpreting
# the subject as a formula when importing the data.
QUOTE_EQUAL=true
# ===========================================
# Help Function
# ===========================================
show_usage() {
# Display script usage and available command line options
echo "Usage: $0 -r <include_rejected> -f <format> -c <case_sensitive> -n <history_count>"
echo "Options:"
echo " -r: Include rejected emails (true/false)"
echo " -f: Output format (markdown/csv)"
echo " -c: Case sensitive email matching (true/false)"
echo " -n: Number of history files to process (number or 'a' for all)"
exit 1
}
# ===========================================
# Command Line Argument Processing
# ===========================================
# Parse command line options using getopts
while getopts "r:f:c:n:h" opt; do
case $opt in
r) include_rejected=$OPTARG ;; # Control inclusion of rejected mails
f) output_format=$OPTARG ;; # Output format selection
c) case_sensitive=$OPTARG ;; # Email matching case sensitivity
n) history_count=$OPTARG ;; # Number of historic files to process
h) show_usage ;; # Show help
*) show_usage ;; # Invalid option
esac
done
# Verify all required arguments are provided
if [[ -z $include_rejected || -z $output_format || -z $case_sensitive || -z $history_count ]]; then
echo -e "\n\nERROR: Missing parameters\n\n"
show_usage
fi
# Normalize argument case for consistent comparison
include_rejected=${include_rejected,,}
output_format=${output_format^^}
case_sensitive=${case_sensitive,,}
# Validate the various formats...
if [[ $output_format != "MARKDOWN" && $output_format != "CSV" ]]; then
echo -e "\n\nERROR: Output format must be either 'markdown' or 'csv' (case insensitive)\n\n"
show_usage
fi
if [[ $include_rejected != "true" && $include_rejected != "false" ]]; then
echo -e "\n\nERROR: Include rejected must be either 'true' or 'false'\n\n"
show_usage
fi
if [[ $case_sensitive != "true" && $case_sensitive != "false" ]]; then
echo -e "\n\nERROR: Case sensitive must be either 'true' or 'false'\n\n"
show_usage
fi
# ===========================================
# Log File Discovery
# ===========================================
# Count available history files by incrementing counter
# until no more log files (plain or compressed) are found
max_history=1
while [ -f "${LOG_PATH}.${max_history}.gz" ] || [ -f "${LOG_PATH}.${max_history}" ]; do
((max_history++))
done
((max_history--))
echo -e "\n\nNOTE: Found $max_history [ ${LOG_PATH} ] history files...\n\n"
# Handle history file count selection
if [[ $history_count == "a" ]]; then
# Process all available history files
history_count=$max_history
elif [[ $history_count -gt $max_history ]]; then
# Adjust if requested count exceeds available files
echo "Warning: Only $max_history history files available. Using that instead."
history_count=$max_history
fi
# ===========================================
# Email Pattern Configuration
# ===========================================
# Prompt for and process email patterns
echo "╔═══════════════════════════════════════════════════╗"
echo "║ Enter recipient email patterns (space-separated): ║"
echo "║ Examples: ║"
echo "║ - user@domain.tld ║"
echo "║ - *@domain.tld (all recipients at domain) ║"
echo "║ - user@* (specific user at any domain) ║"
echo "╚═══════════════════════════════════════════════════╝"
echo ""
echo -n "Recipient(s): "
read -r recipients
# Convert email patterns into grep-compatible regex
email_patterns=""
for recipient in $recipients; do
# Add separator between multiple patterns
if [[ -n $email_patterns ]]; then
email_patterns+="|"
fi
# Convert wildcards to regex patterns and escape dots
recipient=$(echo "$recipient" | sed 's/\./\\./g' | sed 's/\*/.\*/g')
email_patterns+="($recipient)"
done
# ===========================================
# Subject Decoder Setup
# ===========================================
# Create temporary Python script for decoding email subjects
# This handles various character encodings and MIME formats
cat > /tmp/decode_subject.py << 'EOF'
import sys
import email.header
import quopri
import base64
import re
import codecs
def decode_subject(subject):
    try:
        # Separate adjacent encoded words so each one is decoded on its own
        subject = subject.replace('?==?', '?= =?')
        parts = email.header.decode_header(subject)
        decoded_parts = []
        for part, charset in parts:
            if isinstance(part, bytes):
                if charset:
                    # Keys must be lowercase: the lookup below lowercases the charset
                    charset_map = {
                        'windows-1252': 'cp1252',
                        'iso-8859-1': 'latin1',
                        'iso-8859-2': 'latin2',
                        'iso-8859-15': 'latin9',
                        'ks_c_5601-1987': 'cp949',
                        'gb2312': 'gb2312',
                        'big5': 'big5',
                        'shift_jis': 'cp932',
                        'euc-jp': 'euc_jp',
                        'koi8-r': 'koi8_r'
                    }
                    charset = charset_map.get(charset.lower(), charset)
                    decoded_parts.append(part.decode(charset, errors='replace'))
                else:
                    decoded_parts.append(part.decode('utf-8', errors='replace'))
            else:
                decoded_parts.append(part)
        result = ''.join(decoded_parts)
        return result.strip()
    except Exception as e:
        # Fall back to the raw subject if decoding fails for any reason
        return subject
if __name__ == '__main__':
    if len(sys.argv) > 1:
        print(decode_subject(sys.argv[1]))
EOF
# ===========================================
# Output Header Generation
# ===========================================
# Print appropriate header based on chosen format
echo -e "\n\nNOTE: Starting [ ${output_format} ] log extraction process...\n\n"
if [[ $output_format == "MARKDOWN" ]]; then
printf "| DateTime | Subject | From | To |\n"
printf "|----------|---------|------|----|\n"
elif [[ $output_format == "CSV" ]]; then
printf "DateTime,Subject,From,To\n"
fi
# ===========================================
# Log Processing
# ===========================================
# Process each log file, starting with oldest
for ((i=history_count; i>=0; i--)); do
# Determine current file path
current_file="${LOG_PATH}.${i}"
if [[ $i == 0 ]]; then
current_file="${LOG_PATH}" # Handle current log file
fi
# Choose appropriate cat command based on compression
if [[ -f "${current_file}.gz" ]]; then
cat_cmd="zcat"
current_file="${current_file}.gz"
else
cat_cmd="cat"
fi
# Process file if it exists
if [[ -f $current_file ]]; then
# Configure grep command based on case sensitivity
grep_cmd="grep -E"
if [[ $case_sensitive == "false" ]]; then
grep_cmd="grep -iE"
fi
# Configure rejection filter
reject_filter=""
if [[ $include_rejected == "false" ]]; then
reject_filter='| grep -v ": reject: "'
fi
# Process each matching line in the log file
eval "$cat_cmd '$current_file' | $grep_cmd 'from=<[^>]*>\\sto=<($email_patterns)>' $reject_filter" | while IFS= read -r line; do
# Extract datetime from log line
datetime=$(echo "$line" | cut -c1-15)
# Extract and decode subject
raw_subject=$(echo "$line" | grep -oP 'Subject: \K.*?(?= from [^ ]+\[|$)' | tr -d '\n')
# Decode MIME-encoded subjects
if [[ $raw_subject == *"=?"* ]]; then
decoded_subject=$(python3 /tmp/decode_subject.py "$raw_subject")
[ -z "$decoded_subject" ] && decoded_subject=$raw_subject
else
decoded_subject=$raw_subject
fi
# Handle subjects starting with equals sign
if [[ $QUOTE_EQUAL == true && $decoded_subject == "="* ]]; then
decoded_subject="'$decoded_subject"
fi
# Extract email addresses
from_email=$(echo "$line" | grep -oP 'from=<\K[^>]+')
to_email=$(echo "$line" | grep -oP 'to=<\K[^>]+')
# Output in selected format
if [[ $output_format == "MARKDOWN" ]]; then
# Escape markdown table separators and backticks in subject
# (single-quoted sed expressions, so the backslashes reach sed intact)
decoded_subject=$(echo "$decoded_subject" | sed -e 's/|/\\|/g' -e 's/`/\\`/g')
printf "| %s | %s%s%s | %s | %s |\n" \
"$datetime" \
"$SUBJECT_SEP" \
"$decoded_subject" \
"$SUBJECT_SEP" \
"$from_email" \
"$to_email"
elif [[ $output_format == "CSV" ]]; then
# Escape quotes in subject for CSV
decoded_subject=$(echo "$decoded_subject" | sed 's/"/""/g')
printf "%s,\"%s\",%s,%s\n" \
"$datetime" \
"$decoded_subject" \
"$from_email" \
"$to_email"
fi
done
fi
done
# Cleanup temporary files
rm /tmp/decode_subject.py
Save this script as postfix-mail-log-export.sh and make it executable:
chmod +x postfix-mail-log-export.sh
Our script offers a comprehensive set of features designed to make mail log analysis both powerful and flexible. Let's explore each major capability in detail.
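The script provides two output format options, each serving different needs:
- Markdown - a four-column table (DateTime, Subject, From, To) suited to documentation and tickets
- CSV - the same four columns with quoted subjects, suited to import into spreadsheet applications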
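The pattern matching system supports these filtering options:
- Exact recipient addresses, plus * wildcards for the user or domain part
- Several patterns at once, combined into a single alternation
- Optional case-insensitive matching (-c false)
- Optional exclusion of rejected messages (-r false)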
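The script excels at handling complex email subjects:
- Decodes MIME encoded-word subjects (=?charset?...?=) with Python's email.header module
- Maps common charset labels such as iso-8859-1 and shift_jis to Python codec names
- Replaces undecodable bytes instead of aborting, and falls back to the raw subject on error
- Optionally prefixes subjects that start with = with a single quote so spreadsheets do not treat them as formulas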
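To make the subject decoding step concrete, here is a minimal Python sketch that can be run on its own. It uses email.header.make_header as a shorter stand-in for the manual loop in the script's embedded decoder, so it does not apply the script's charset mapping. The function name decode_mime_subject and the sample subjects are illustrative assumptions.

```python
from email.header import decode_header, make_header


def decode_mime_subject(raw):
    """Decode RFC 2047 encoded words; return the input unchanged on failure."""
    try:
        return str(make_header(decode_header(raw)))
    except Exception:
        # Unknown charsets or malformed input: keep the raw subject
        return raw


if __name__ == "__main__":
    # Quoted-printable, base64, and plain subjects (all made-up samples)
    print(decode_mime_subject("=?UTF-8?Q?Gr=C3=BC=C3=9Fe?="))
    print(decode_mime_subject("=?iso-8859-1?B?Q2Fm6Q==?="))
    print(decode_mime_subject("Plain subject"))
```

Returning the raw value on failure matches the script's behavior of never dropping a row just because its subject could not be decoded.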
| Feature | This Script | Manual Grep | Logwatch |
|---|---|---|---|
| Subject Line Support | Yes | Limited | No |
| MIME Decoding | Yes | No | No |
| Multiple Output Formats | Yes | No | Limited |
| Pattern Matching | Advanced | Basic | Basic |
Let's explore some practical examples of using the script in different scenarios.
Here's a basic command that covers most common use cases:
./postfix-mail-log-export.sh -r false -f csv -c true -n 1
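This command configuration:
- Excludes rejected emails (-r false)
- Writes CSV output (-f csv)
- Matches recipient patterns case-sensitively (-c true)
- Processes the current log plus one rotated history file (-n 1)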
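Once the CSV rows are saved, they can be analyzed with Python's csv module. The sketch below counts messages per sender. It assumes the input contains only the header row and data rows (the NOTE lines and the recipient prompt that the script also prints would have to be removed first), and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical export; real data would come from the script's CSV output.
SAMPLE = (
    "DateTime,Subject,From,To\n"
    'Jan 12 08:15:42,"Quarterly report",alice@example.org,bob@example.com\n'
    'Jan 12 09:01:07,"He said ""hi"", twice",alice@example.org,bob@example.com\n'
    'Jan 12 09:30:11,"\'=SUM(A1:A2)",carol@example.org,bob@example.com\n'
)


def count_by_sender(csv_text):
    """Count exported rows per sender address."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["From"] for row in reader)


if __name__ == "__main__":
    for sender, total in count_by_sender(SAMPLE).most_common():
        print(sender, total)
```

The doubled quotes and the leading single quote in the sample subjects correspond to the script's CSV quote escaping and its QUOTE_EQUAL option.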
The script supports sophisticated email pattern matching that can handle various use cases:
- user@domain.tld - Exact match for specific addresses
- *@domain.tld - Matches all recipients at a specific domain
- user@* - Tracks a specific user across all domains
Multiple patterns can be combined by separating them with spaces when entering them at the prompt.
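The wildcard translation can be sketched in Python as follows. This is an illustration of the idea, not a copy of the script's sed pipeline: it escapes every regex metacharacter (the script escapes only dots) and anchors the match to the whole address. The function names pattern_to_regex and build_matcher are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate one recipient pattern: '*' becomes '.*', the rest is literal."""
    return ".*".join(re.escape(piece) for piece in pattern.split("*"))


def build_matcher(patterns, case_sensitive=True):
    """Compile a space-separated pattern list into one anchored regex."""
    alternatives = "|".join(f"({pattern_to_regex(p)})" for p in patterns.split())
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(f"^(?:{alternatives})$", flags)


if __name__ == "__main__":
    matcher = build_matcher("*@domain.tld user@*")
    for address in ("alice@domain.tld", "user@example.org", "other@example.org"):
        print(address, bool(matcher.match(address)))
```

Escaping the literal parts keeps characters such as + in an address from being read as regex operators.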
While the script is designed to be robust, you might encounter some common issues. Here's how to address them:
If you encounter permission-related problems:
For character encoding-related issues:
To optimize performance when working with large log files:
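One general way to keep memory use flat on large logs, sketched here in Python, is to stream each file line by line instead of loading it whole. This is an assumption about a reasonable approach and not something the export script itself does. The helper names iter_log_lines and count_matches are hypothetical.

```python
import gzip
import os
import tempfile


def iter_log_lines(path):
    """Yield lines from a plain or gzip-compressed log without loading it whole."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


def count_matches(path, needle):
    """Count the lines containing a fixed substring, one line at a time."""
    return sum(1 for line in iter_log_lines(path) if needle in line)


if __name__ == "__main__":
    # Build a small compressed log in a temporary directory for demonstration
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "mail.log.2.gz")
        with gzip.open(sample, "wt", encoding="utf-8") as handle:
            handle.write("a to=<bob@example.com>\nb to=<eve@example.com>\n")
        print(count_matches(sample, "to=<bob@"))
```

Filtering on a fixed substring before applying any regular expression also tends to reduce the work done per line.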
To ensure optimal performance and reliability, consider these maintenance aspects:
Implement these maintenance practices:
Keep the script current by:
This Postfix mail log export script represents a powerful solution for email log analysis, combining the flexibility of Bash with Python's robust text processing capabilities. Whether you're troubleshooting mail delivery issues or conducting security audits, this tool simplifies the process of extracting and analyzing email log data.
Remember to properly configure Postfix for subject logging and ensure all dependencies are met before implementing the script. With its flexible output options and powerful pattern matching capabilities, this script can significantly streamline your mail server administration tasks.
The combination of easy setup, powerful features, and flexible output options makes this script an invaluable tool for any system administrator working with Postfix mail servers. By following the installation and configuration steps outlined in this guide, you'll be well-equipped to handle various mail log analysis tasks efficiently.