Web Hosting Forum | Lunarpages
Author Topic: How-to: Train SpamAssassin  (Read 9259 times)
Danielle
« on: April 08, 2004, 02:29:27 PM »

Many thanks to w98 (a.k.a. id) for writing this how-to on SA training, which can be found at the following location:

http://www.lunarforums.com/forum/index.php?topic=13958.0

Please note that the posts that follow in this thread refer to an older copy of the how-to, so please review the how-to in the thread linked above and post any messages there instead.

Thanks :D
« Last Edit: August 18, 2005, 08:42:42 AM by Danielle »

Danielle Wallace

w98
« Reply #1 on: April 09, 2004, 03:13:36 PM »

Hi Danielle,

Any chance you could simply provide a link to the other thread? Lopht and I have made some changes to the documentation and the script itself, so that would be a better place to send the users, perhaps?

Thanks,
Ian Douglas, aka "id", aka "w98"

Rocknrob
« Reply #2 on: April 09, 2004, 08:10:58 PM »

Spaceship captain, help!
On step 3, where exactly inside the folder do I put this?
required_hits 5
rewrite_subject 1
subject_tag {SPAM}
bayes_path /home/lpaccount/.spamassassin/bayes
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information
I am nervous at this point. Should these lines have a # in front of them or not?
Thanks,

w98
« Reply #3 on: April 09, 2004, 09:40:13 PM »

Code:
required_hits 5
rewrite_subject 1
subject_tag {SPAM}
bayes_path /home/lpaccount/.spamassassin/bayes
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information


All of that goes in your /home/lpaccount/.spamassassin/user_prefs file, assuming your LP account name is "lpaccount", of course.

The lines that begin with a # are just comments; you don't need those.

Essentially, here's a line-by-line description of what each portion of that example user_prefs file does:

required_hits 5
This tells SA that anything that scores higher than 5.0 points should be flagged as spam.

rewrite_subject 1
This tells SA to rewrite the start of the subject line if the message scores higher than the required_hits threshold from the previous setting.

subject_tag {SPAM}
This tells SA to prepend the string "{SPAM}" to the subject of any email that scores higher than "required_hits".

bayes_path /home/lpaccount/.spamassassin/bayes
This is the start of the path for your Bayesian database, including the start of the filename. If this path ended in "/ianwashere", SA would look for files like "ianwashere_toks" and "ianwashere_seen", etc.

bayes_file_mode 0600
This sets the permissions on your Bayesian database files (0600 means only the file's owner can read and write them).

bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information

These lines tell SpamAssassin's Bayes system to ignore these headers, so that markup added by another scanner (here, MailScanner) is not treated as content that could be spammy. These are generally a good idea to have here.
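
For anyone who finds it easier to see this as code, here is a rough Python sketch of how the file above can be read. It is only an illustration, not how SpamAssassin itself parses the file: `parse_user_prefs` and `bayes_file_names` are made-up helper names, and the "_toks" / "_seen" suffixes are simply the two file names mentioned above.

```python
from collections import defaultdict

SAMPLE_PREFS = """\
# How many hits before a mail is considered spam.
#required_hits 5
required_hits 5
rewrite_subject 1
subject_tag {SPAM}
bayes_path /home/lpaccount/.spamassassin/bayes
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
"""


def parse_user_prefs(text):
    """Collect 'name value' settings, skipping blank lines and # comments."""
    settings = defaultdict(list)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # commented-out lines have no effect
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[parts[0]].append(value)  # repeated names accumulate
    return dict(settings)


def bayes_file_names(bayes_path):
    """bayes_path is a prefix: its last component starts each database file name."""
    prefix = bayes_path.rsplit("/", 1)[-1]
    return [prefix + "_toks", prefix + "_seen"]


prefs = parse_user_prefs(SAMPLE_PREFS)
print(prefs["required_hits"])
print(prefs["bayes_ignore_header"])
print(bayes_file_names(prefs["bayes_path"][0]))
```

Note that the commented `#required_hits 5` line is skipped, and each repeated `bayes_ignore_header` line adds another entry instead of replacing the previous one.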

-id

Rocknrob
« Reply #4 on: April 09, 2004, 09:43:49 PM »

That didn't really help. I need to know where exactly within the file I should put it.
Code:
# SpamAssassin user preferences file.  See 'perldoc Mail::SpamAssassin::Conf'
# for details of what can be tweaked.
###########################################################################

# How many hits before a mail is considered spam.
#required_hits 5
# Whitelist and blacklist addresses are now file-glob-style patterns, so
# "friend@somewhere.com", "*@isp.com", or "*.domain.net" will all work.
# whitelist_from someone@somewhere.com

# Add your own customised scores for some tests below.  The default scores are
# read from the installed spamassassin rules files, but you can override them
# here.  To see the list of tests and their default scores, go to
# http://spamassassin.org/tests.html .
#
# score SYMBOLIC_TEST_NAME n.nn

# Speakers of Asian languages, like Chinese, Japanese and Korean, will almost
# definitely want to uncomment the following lines.  They will switch off some
# rules that detect 8-bit characters, which commonly trigger on mails using CJK
# character sets, or that assume a western-style charset is in use.
#
# score HEADER_8BITS 0
# score HTML_COMMENT_8BITS 0
# score SUBJ_FULL_OF_8BITS 0
# score UPPERCASE_25_50 0
# score UPPERCASE_50_75 0
# score UPPERCASE_75_100 0



Where inside of there?

w98
« Reply #5 on: April 09, 2004, 09:54:55 PM »

Put it at the very end of the file, or simply erase everything in there and replace it with my example, since my example is pretty much "factory default" anyway, other than the path to your Bayesian database files.
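
If you would rather script the "append at the very end" option than edit by hand, something along these lines should do it. This is only a sketch: `append_settings` is a made-up helper, and it runs against a throwaway temp file here, so point it at your real user_prefs only after making a backup.

```python
import os
import tempfile

NEW_SETTINGS = [
    "required_hits 5",
    "rewrite_subject 1",
    "subject_tag {SPAM}",
]


def append_settings(path, lines):
    """Append settings to the end of an existing prefs file, leaving the rest untouched."""
    with open(path, "r", encoding="utf-8") as handle:
        existing = handle.read()
    with open(path, "a", encoding="utf-8") as handle:
        if existing and not existing.endswith("\n"):
            handle.write("\n")  # avoid gluing onto an unterminated last line
        for line in lines:
            handle.write(line + "\n")


# Demonstrate on a throwaway file rather than a real user_prefs.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "user_prefs")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("# How many hits before a mail is considered spam.\n#required_hits 5")
    append_settings(demo_path, NEW_SETTINGS)
    with open(demo_path, encoding="utf-8") as handle:
        result = handle.read()

print(result)
```

The existing commented lines stay where they are; the new settings simply follow them.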

leighsww
« Reply #6 on: April 17, 2004, 04:38:57 PM »

FANTASTIC, w98!!

I can't believe you documented all that!!  Must have taken you many sleepless nights!

Excellent and THANKS for this amazing contribution!

Tracie
« Reply #7 on: April 17, 2004, 05:01:25 PM »

Excellent information!

Thanks w98!

w98
« Reply #8 on: April 17, 2004, 09:30:03 PM »

Actually, it only took about a day or two to write up, and I tested the documentation on my second account.

Next step will be to add documentation and screen shots of copying messages to/from Outlook.

Glad it's helping some of you out there.

Mart
« Reply #9 on: April 18, 2004, 08:04:33 AM »

I'd just like to add my thanks to the list. I have it all set up and it seems to be working nicely.  Now to see how SA responds to my training :).

w98
« Reply #10 on: April 19, 2004, 12:18:14 AM »

I've had it running since a day or so before posting my ideas, and almost everything ending up in my spam mailboxes is getting there with a BAYES_99 header, meaning that SpamAssassin is 99%-100% confident the message is spam.

Of course, I get a LOT of spam at my domains, and having a catch-all address set up means I catch that much more, since some spammers have misspelled my email addresses (e.g., "info@wild98.com" got misspelled as "o@wild98.com", and "ian@wild98.com" as "an@wild98.com"), so I'm catching alllll kinds of spam.

Glad it's helping though, and glad everyone's getting use out of it. I have a few more things I'll be adding to the topic this week.
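
If you want to check your own mail for the BAYES_99 rule mentioned above, here is a small Python sketch that pulls any BAYES_nn rule names out of an X-Spam-Status value. The sample header is made up for illustration, the exact fields vary by SpamAssassin version, and `bayes_rules` is just a name I picked.

```python
import re

# Made-up header value for illustration; real ones differ by SpamAssassin version.
SAMPLE_STATUS = "Yes, hits=9.1 required=5.0 tests=BAYES_99,HTML_MESSAGE,MIME_HTML_ONLY"

BAYES_RULE = re.compile(r"\bBAYES_\d+\b")


def bayes_rules(status_header):
    """Return every BAYES_nn rule name mentioned in an X-Spam-Status value."""
    return BAYES_RULE.findall(status_header)


print(bayes_rules(SAMPLE_STATUS))
print(bayes_rules("No, hits=0.3 required=5.0 tests=HTML_MESSAGE"))
```

An empty list means no Bayes rule fired on that message.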

Tatami
« Reply #11 on: April 27, 2004, 12:42:39 AM »

Hello,

Sorry, I may have missed a step: when calling http://www.mydomain.com/cgi-bin/sa-learn.cgi, I get a server error message (500).

Also, filtering doesn't appear to work when I spam myself...

Execute permissions are OK for the file, and the paths should be OK. The SA spam/spambox options are enabled, and the top-level myspam/myham boxes have been created...

==========================
!/usr/bin/perl

my $salearn = "/usr/bin/sa-learn" ;
$| ;

print "Content-type: text/plain\n\n" ;

print "Learning SPAM:\n" ;
print `$salearn -p /home/lpaccount/.spamassassin/user_prefs --mbox --spam --showdots /home/lpaccount/mail/myspam` ;
print "\n\n" ;

[ etc...]
==========================

Thanks for your input...

w98
« Reply #12 on: April 27, 2004, 08:12:24 AM »

The first line must be
Code:
#!/usr/bin/perl

Looks like you were missing the # at the start of the first line.
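
A quick way to check for this before uploading a script is to look at its first two characters. Here is a tiny Python sketch of that check; `has_shebang` is a made-up helper, and the two sample strings are just the broken and fixed first lines from this thread.

```python
def has_shebang(script_text):
    """True if the script's very first characters are '#!'."""
    return script_text.startswith("#!")


broken = '!/usr/bin/perl\n\nprint "Content-type: text/plain\\n\\n" ;\n'
fixed = "#" + broken  # same script with the missing '#' restored

print(has_shebang(broken))
print(has_shebang(fixed))
```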

kwdavids
« Reply #13 on: June 26, 2004, 02:17:41 PM »

I set up the Bayesian options in SpamAssassin and fed it over 1000 spam messages, plus a hundred good emails. The training script seemed to work OK: it counts the messages it processes.

However, none of the emails has a BAYES_nn in the X-Spam-Status header.

Here's my user_prefs:

Code:
# SpamAssassin config file

# How many hits before a message is considered spam.
required_hits 5.8

# Whether to change the subject of suspected spam
rewrite_subject  0

# Encapsulate spam in an attachment
report_safe             0

# Use terse version of the spam report
use_terse_report     0

# Enable the Bayes system
use_bayes               1

# Enable Bayes auto-learning
auto_learn              1

# Other Bayes stuff
bayes_path /home/[deleted actual value]/.spamassassin/bayes
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information

Kevin

voman
« Reply #14 on: March 05, 2005, 09:39:14 AM »

Quote from: w98
The first line must be
Code:
#!/usr/bin/perl

Looks like you were missing the # at the start of the first line.


I am getting the same Internal Server Error message. I've followed the directions exactly. Here is my sa-learn.cgi file contents:

Code:
#!/usr/bin/perl

my $salearn = "/usr/bin/sa-learn" ;
$| ;

print "Content-type: text/plain\n\n" ;

print "Learning SPAM:\n" ;
print `$salearn -p /home/voman02/.spamassassin/user_prefs --mbox --spam --showdots /home/voman02/mail/myspam` ;
print "\n\n" ;

print "Learning HAM:\n" ;
print `$salearn -p /home/voman02/.spamassassin/user_prefs --mbox --ham --showdots /home/voman02/mail/myham` ;
print "\n\n" ;

exit ;
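
For reference, the two sa-learn calls in the script above differ only in the --spam/--ham flag and the mailbox name. The Python sketch below builds the same two command lines without running anything, which can help when checking paths by eye. `sa_learn_command` is a made-up helper, and the /home/ACCOUNT/mail/myspam and myham layout is assumed from the script above.

```python
def sa_learn_command(account, kind, salearn="/usr/bin/sa-learn"):
    """Build the sa-learn argument list for one mailbox; kind is 'spam' or 'ham'."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    home = "/home/" + account
    return [
        salearn,
        "-p", home + "/.spamassassin/user_prefs",  # per-user prefs file
        "--mbox",                                   # input is an mbox mailbox
        "--" + kind,                                # learn as spam or as ham
        "--showdots",
        home + "/mail/my" + kind,                   # myspam or myham mailbox
    ]


# Only prints the commands; nothing is executed.
for kind in ("spam", "ham"):
    print(" ".join(sa_learn_command("lpaccount", kind)))
```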