This script automates the process of teaching SpamAssassin to recognize spam and ham (non-spam) messages from users' mailboxes, while allowing for specific home directories to be excluded from processing using a skip list file.

Skip List File:
- The script reads a file named skip_folders.txt, which should be placed in the same directory as the script.
- The skip_folders.txt file contains a list of home folder names (one per line) that should be excluded from processing.
- If no skip list file is found, the script will process all home directories.
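
The skip-list lookup described above could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name is_skipped and its argument order are assumptions made for this example.

```shell
# Hypothetical helper (name and arguments are assumptions for this sketch).
# Succeeds (exit status 0) when a folder name is listed in the skip file.
#   $1 = home folder name, e.g. "user3"
#   $2 = path to the skip list file
is_skipped() {
    # No skip list file means nothing is excluded.
    [ -f "$2" ] || return 1
    # -F: treat the name as a fixed string (so "lost+found" is not a regex)
    # -x: the whole line must match, so "user" does not match "user3"
    # -q: print nothing, only set the exit status
    grep -Fxq -- "$1" "$2"
}
```

A caller might use it as: is_skipped "$name" "$skip_file" && continue. Matching whole lines as fixed strings keeps partial names and special characters from causing accidental exclusions.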

Spam and Ham Classification:
- For each user's home directory (except those listed in the skip list), the script processes two mail folders:
  - Maildir/.Junk: Messages in this folder are learned as spam using sa-learn --spam.
  - Maildir/cur: Messages in this folder are learned as ham (non-spam) using sa-learn --ham.
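
As a rough sketch of the per-user learning step, assuming the folder layout named above: the function name learn_home, the SA_LEARN override variable, and the directory existence checks are all assumptions introduced here for illustration, and the real script may be organized differently.

```shell
# SA_LEARN is a hypothetical override so the sketch can be dry-run,
# for example with SA_LEARN=echo. By default it calls the real sa-learn.
SA_LEARN=${SA_LEARN:-sa-learn}

# Train on a single home directory.
#   $1 = path to the home directory, e.g. /home/user1
learn_home() {
    home=$1
    echo "Entering $home"
    # Junk folder: everything in it is treated as spam.
    if [ -d "$home/Maildir/.Junk" ]; then
        echo "Processing $home/Maildir/.Junk"
        echo "Learning messages as spam"
        "$SA_LEARN" --spam "$home/Maildir/.Junk"
    fi
    # Inbox "cur" folder: everything in it is treated as ham.
    if [ -d "$home/Maildir/cur" ]; then
        echo "Processing $home/Maildir/cur"
        echo "Learning messages as ham (non-spam)"
        "$SA_LEARN" --ham "$home/Maildir/cur"
    fi
}
```

Running it with SA_LEARN=echo prints the sa-learn arguments instead of executing them, which is a convenient way to check which folders would be used before touching the Bayes database.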

Skipped Folders:
- Any home folder listed in the skip_folders.txt file is skipped during processing.
- After processing, the script outputs a list of skipped folders.
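
One way to collect and report skipped folders is sketched below. The function name process_homes, its arguments, and the way names are accumulated in a newline-separated string are assumptions for this example, not the script's confirmed implementation.

```shell
# Hypothetical sketch: walk a base directory, skip listed folders,
# and report the skipped ones at the end.
#   $1 = base directory holding the home folders, e.g. /home
#   $2 = path to the skip list file
process_homes() {
    base=$1
    skip_file=$2
    skipped=""
    for home in "$base"/*/; do
        # Guard against an unmatched glob or non-directories.
        [ -d "$home" ] || continue
        name=$(basename "$home")
        if [ -f "$skip_file" ] && grep -Fxq -- "$name" "$skip_file"; then
            # Remember the name, one per line, and move on.
            skipped="$skipped$name
"
            continue
        fi
        # The per-folder learning step would go here.
        echo "Entering ${home%/}"
    done
    if [ -n "$skipped" ]; then
        echo "The following folders were skipped due to exclusion in $skip_file:"
        printf '%s' "$skipped"
    fi
}
```

Collecting the names during the loop and printing them once at the end keeps the per-folder progress output uncluttered.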

Synchronization:
- After processing all folders, the script synchronizes the SpamAssassin Bayes database using sa-learn --sync.
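
The final synchronization step might look roughly like this. The function name sync_bayes and the SA_LEARN override are assumptions for illustration, and printing the closing message only on success is a design choice of this sketch, not confirmed behaviour of the script.

```shell
# SA_LEARN is a hypothetical override for dry runs (e.g. SA_LEARN=echo).
SA_LEARN=${SA_LEARN:-sa-learn}

# Merge the Bayes journal into the database once, after all learning is done.
sync_bayes() {
    echo "Synchronizing the database and the journal"
    # Only report success if sa-learn --sync itself succeeded.
    "$SA_LEARN" --sync && echo "Done :)"
}
```

sa-learn also accepts a --no-sync option on learning runs, which defers this merge; whether the script uses it is not stated here, but when it is used, a single --sync at the end is what brings the database up to date.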

Example skip_folders.txt:
user3
lost+found

Requirements:
- SpamAssassin must be installed and configured on the system.
- Mail folders should be in the default Maildir format under each user's home directory.

Usage:
- Clone the repository or download the script.
- Ensure the skip_folders.txt file is in the same directory as the script.
- Make the script executable:
  chmod +x script_name.sh
- Run the script as a superuser (root) to process all home directories:
  sudo ./script_name.sh

Output:
- The script prints progress as it processes each folder, including spam/ham learning status.
- At the end of the run, it lists any folders that were skipped based on the skip_folders.txt file.
- The script also reports when the SpamAssassin Bayes database is synchronized.

Example output:
Entering /home/user1
Processing /home/user1/Maildir/.Junk
Learning messages as spam
Processing /home/user1/Maildir/cur
Learning messages as ham (non-spam)
Entering /home/user2
Processing /home/user2/Maildir/.Junk
Learning messages as spam
Processing /home/user2/Maildir/cur
Learning messages as ham (non-spam)
The following folders were skipped due to exclusion in /path/to/skip_folders.txt:
user3
lost+found
Synchronizing the database and the journal
Done :)