UPDATED: April 27, 2010
v3.06 is released.
I've built a "Build Your Own SpamAssassin Trainer" web app that asks a few simple questions and generates everything you'll need for the Perl script. I'll be making other changes to the Perl script this spring that should improve performance.
Documentation: please visit http://iandouglas.com/spamassassin-trainer/
Build your trainer script here: http://iandouglas.com/sa-trainer/
Using the /sa-trainer/ link will let you configure your script in a web page using some simple prompts, and build a .zip file for you.
DISCLAIMERS:
Disclaimer #1: I wrote this script back in 2001, have been hacking at it ever since, and first posted it here in April 2004. It works amazingly well for me. Your mileage, of course, may vary.
Disclaimer #2: LunarPages has given me permission to post this information and quick-start guide with the following notes:
please include a warning that it is the user's own responsibility to mess with it :)
and
(paraphrased) Please announce that all LunarPages users should consider this message thread as the primary source of support for sa-trainer.cgi
I fully intend to hang out here (since I'm also a LunarPages user myself) to support this script till the end of time, so I'm happy to comply.
Disclaimer #3: While I have tried very hard to document this as carefully as I can and use 'best practice' software development efforts, some errors are bound to happen, so there are NO guarantees on these instructions whatsoever. However, numerous LunarPages users use my script on a regular basis and have seen dramatic drops in the amount of spam in their Inbox.
Disclaimer #4: The new script (starting at v3.02) and full supporting documentation are located at iandouglas.com (linked later in the quick-start guide, and again at the bottom of this message). LunarPages support staff have been awesome about letting me move the script out of this message thread (since the script is now too big to fit in a single message here). Just be aware that viewing the full documentation and downloading the script itself will take you away from LunarPages and LunarForums.com. Please do return here for support if you are a LunarPages user -- I promised LP that I'd always be available to this forum for assistance.
THE NEW sa-trainer.cgi QUICK-START GUIDE
Here are some very general instructions for how to set up SpamAssassin in CPanel, configure the final details, download and install the script, and get it running. These instructions will teach you to do the following:
- create an Email account called globalham@yourdomain.com for your users to forward their non-spam messages to
- enable the CPanel "spam box" option, and scan each individual user's spam mailbox
Assumptions I Need to Make about You
If you want to take the simplest approach, and use the default behavior of this script:
- that you know how to log in to CPanel using your LunarPages or other hosting account details
- that you know how to create a new mailbox for your primary domain in CPanel
- that you can save a copy of my script on your local computer, change it in a text editor like Notepad or TextEdit (not a word processor like MS Word), and save the file
- that you know how to use an FTP program to upload a copy of the script to your hosting account
- if you have multiple users with mailboxes through your account, that you can communicate effectively with your users to clean up their own mailboxes once you've finished running this training script
If you want to use more advanced features of this script:
- that you can do all of the above, and know how to search through the configuration settings within the script to make changes to suit your needs
- if you download your Email through third-party software (Outlook, Outlook Express, Thunderbird, Eudora, etc), that you are familiar enough with that software to add an IMAP account or profile
- or, if you always use webmail such as Squirrelmail or Horde, that you are familiar enough with using the software to move or copy messages to other folders
Terminology You Need to Learn
SPAM: unsolicited Emails that you've received that want you to buy something or contain adult-themed references that you'd rather not get anymore.
HAM: non-spam, legitimate Emails from friends, family, newsletters, and so on
SA: short for SpamAssassin
False-Positive: this is a non-spam (HAM) message that SA flagged as SPAM and ended up in our spam box.
False-Negative: this is a SPAM message that SA flagged as non-spam (HAM) that ended up in our Inbox
IMAP: this is an Email protocol used to read and organize the Email messages stored on your hosting account (it does not send mail). Generally, IMAP will leave messages on the server instead of downloading them to your computer and deleting the server's copy.
"spam/ham folder pair": this is a set of mail folders (which may actually be files instead of folders) that we will set up and use to store copies of messages to train SpamAssassin with.
primary domain: the first (or only) domain name configured for your CPanel account, not an add-on or parked domain added later.
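To make the false-positive and false-negative terms concrete, here is a minimal sketch in Python. It is only an illustration, not part of sa-trainer.cgi: the function name `describe_outcome` is made up, and the 3.5 threshold is simply the `required_hits` value used in the user_prefs file later in this guide. It assumes a message is flagged as SPAM when its score is at or above that threshold.

```python
def describe_outcome(score: float, really_spam: bool, required_hits: float = 3.5) -> str:
    """Name the result of SA's decision for one message (hypothetical helper)."""
    # Assumption: SA flags a message as SPAM when score >= required_hits.
    flagged_as_spam = score >= required_hits
    if flagged_as_spam and not really_spam:
        return "false-positive"   # HAM that landed in the spam box
    if not flagged_as_spam and really_spam:
        return "false-negative"   # SPAM that landed in the Inbox
    return "correct"


print(describe_outcome(5.2, really_spam=False))
print(describe_outcome(1.0, really_spam=True))
```

False-positives are the messages your users forward to the globalham mailbox; false-negatives are the ones the trainer needs to see as spam.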
NOTE: for all examples in the setup and the script itself, the account name I will use is myaccount. The primary domain for my CPanel account is mydomain.com. I'll do my best to keep these terms bolded throughout this text to highlight where you'll need to insert your own information.
Configure CPanel to turn SpamAssassin on
Log in to your CPanel interface, click on the 'Mail' icon, click on the link for 'SpamAssassin'.
Click on the 'Enable SpamAssassin' button, click on the 'go back' link.
Click on 'Enable Spam Box', click on the 'go back' link.
Click on the 'go back' link again so you're back at the 'Mail' icon menu list where you clicked on 'SpamAssassin'
Click on 'Add/Remove/Manage Accounts'
Click on 'Add Account' link at the bottom
Set the Email account as 'globalham' at your primary domain name, set a password, and set a reasonable quota based on your usage, such as 100MB or 200MB. Click the 'Create' button
Click the 'Go Back' link
Create/Edit /home/myaccount/.spamassassin/user_prefs
In CPanel, click on the File Manager icon
Click on the folder next to the ".spamassassin" folder link
If "user_prefs" doesn't already exist, click on the "Create New File" link, call the file "user_prefs" and specify that it is a Text Document, and click the Create button.
Click on the filename link for "user_prefs", and in the top-right corner of the screen, select to edit the file.
Replace the entire contents of the file with this text:
use_bayes 1
required_hits 3.5
rewrite_subject 1
subject_tag {SPAM _SCORE(0)_}
bayes_path /home/myaccount/.spamassassin/bayes
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information
... be sure to replace "myaccount" with your actual CPanel username, and click the 'Save' button
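If you'd rather not hand-edit the text above, the "replace myaccount" step can be sketched in a few lines of Python. This is just an illustration under my own assumptions: the helper name `render_user_prefs` and the example username `jdoe` are made up, and the settings are copied verbatim from the listing above.

```python
# The user_prefs text from this guide, with the placeholder account name.
USER_PREFS_TEMPLATE = """\
use_bayes 1
required_hits 3.5
rewrite_subject 1
subject_tag {SPAM _SCORE(0)_}
bayes_path /home/myaccount/.spamassassin/bayes
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information
"""


def render_user_prefs(account: str) -> str:
    """Return the user_prefs text with 'myaccount' swapped for a real username."""
    # Reject obviously wrong usernames (empty, containing spaces or slashes).
    if not account or "/" in account or any(c.isspace() for c in account):
        raise ValueError("account must be a plain CPanel username")
    return USER_PREFS_TEMPLATE.replace("myaccount", account)


print(render_user_prefs("jdoe"))
```

Paste the printed text into the user_prefs editor in File Manager and click 'Save' as described above.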
Getting the sa-trainer.cgi Script
Build everything you need at
http://iandouglas.com/sa-trainer/
This will take you away from LunarForums.com, but is the preferred method for getting a stable copy of the script. On that page, follow the instructions and download the .zip file it creates for you.
Upload sa-trainer.cgi to your hosting account
Using your favorite FTP program, upload the script in ASCII mode into your /www/cgi-bin/ folder, and set the permission bits (chmod) to 755. The script likely will not run without this.
*** I recommend renaming the script to something other than sa-trainer.cgi (like my-spam-trainer.cgi, or anything with a .cgi file extension) so that people can't easily tell you run this script, in case any bugs are found that could be exploited (though I haven't found any myself, nor have any been reported to me, in the past three years).
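For anyone curious what "ASCII mode" and "755" actually do, here is a minimal Python sketch, assuming you have a copy of the script on a Unix-like machine. The function names are made up for illustration. ASCII-mode FTP transfers convert Windows line endings (CR+LF) to Unix ones (LF), and 755 means the owner can read, write and execute while everyone else can read and execute.

```python
import os
import stat


def to_unix_newlines(data: bytes) -> bytes:
    """Convert CR+LF line endings to LF, as an ASCII-mode upload would."""
    return data.replace(b"\r\n", b"\n")


def prepare_cgi(path: str) -> str:
    """Fix line endings, set permissions to 755, and return the mode string."""
    with open(path, "rb") as fh:
        data = fh.read()
    with open(path, "wb") as fh:
        fh.write(to_unix_newlines(data))
    os.chmod(path, 0o755)  # rwx for the owner, r-x for group and others
    return stat.filemode(os.stat(path).st_mode)
```

On a regular file, `prepare_cgi("my-spam-trainer.cgi")` should report `-rwxr-xr-x`, which is the same thing your FTP program or File Manager shows as 755.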
If you do not have an FTP program, you can open the script in Notepad or TextEdit again, copy the entire contents to your clipboard, and do the following:
In CPanel, click on the File Manager icon
Click on the yellow folder beside "public_html" or "www" (they both go to the same place)
Click on the yellow folder beside "cgi-bin"
Click on the link to "create new file"
In the top right corner of the screen, specify to create a new text document called "sa-trainer.cgi" (or some other filename to avoid any security issues) and click the Create button
In the new window that pops up, paste the contents of the script into the space provided, and click the 'save' button at the bottom, then close the pop-up window
Back in the File Manager window, click on the filename (which is a link) for sa-trainer.cgi
Click on the link to set the permissions of the script, and select the 'execute' bit for all 3 columns so the permissions number reads '755' and click the 'change' button.
Have some recent spam/ham available to train with
Once you have some spam and ham messages available in the mailboxes you configured, simply call your script in your web browser, like "http://www.mydomain.com/cgi-bin/sa-trainer.cgi" (or whatever you called your copy of the script).
Ongoing maintenance
1. Teach your users to forward non-spam messages to globalham@mydomain.com, with a disclaimer that no human eyes will ever see the mailbox (you could be found liable for reading their private messages, so be sure you're not secretly peeking in there...). Instruct them not to forward messages over 100kb or with file attachments, as these can confuse SpamAssassin and slow down the scanning.
2. Once scanning is complete, empty the Inbox for the globalham@mydomain.com account - the easiest and quickest way to avoid any legal/privacy concerns would be to completely delete the mailbox from CPanel and rebuild it.
3. You will also need to instruct your users to empty their spam boxes once scanning is complete. To do this, they can highlight/select all of their spam messages in the 'spam' folder, and use the delete function of their webmail/Email client software.
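The "no messages over 100kb or with attachments" rule from step 1 can be sketched as a quick check in Python. This is not something sa-trainer.cgi does for you; the function name `ok_to_forward` is made up, and I'm assuming "100kb" means 100 * 1024 bytes of raw message.

```python
from email import message_from_bytes

MAX_BYTES = 100 * 1024  # assumption: "100kb" read as 100 KiB of raw message


def ok_to_forward(raw_message: bytes) -> bool:
    """Return True if a raw Email is small enough and has no file attachments."""
    if len(raw_message) > MAX_BYTES:
        return False
    msg = message_from_bytes(raw_message)
    # Walk every MIME part and look for one marked as an attachment.
    for part in msg.walk():
        if part.get_content_disposition() == "attachment":
            return False
    return True
```

A user (or a mail filter) could run a message through a check like this before forwarding it to the globalham mailbox.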
Did I forget anything?
Be sure to notify me if I've neglected to describe any step along the way.
Full Documentation
A MUCH larger version of this documentation is available at
http://iandouglas.com/spamassassin-trainer/
You will probably need an hour or more to read through it (told you it was huge), but it goes much deeper into the configurable options of the script.
And as stated in a few other places here: if you are a LunarPages customer, this forum message thread that you're reading right now is your primary means of support for this script, so please post messages here if you have questions or problems with the script.
Like it? Love it? Need extra help?
You can PayPal me any amount up to $20 for assistance and customization.
Good luck, and happy spam fighting!