Web Hosting Forum | Lunarpages

Author Topic: How-to: Train SpamAssassin - Updated April 27, 2010  (Read 189555 times)

Offline quilthug

  • Newbie
  • *
  • Posts: 5
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #255 on: April 15, 2009, 10:40:49 AM »
Ok so I think I followed the directions step by step. I'm not an expert, but I've been around the block a few times.

I put in my address:
http://www.mydomain.com/cgi-bin/sa-trainer.cgi

and I get the following:
sa-trainer.cgi version 3.04 by Ian Douglas, iandouglas.com, Copyright 2004-2007
Some Rights Reserved under a Creative Commons "Attribution Non-commercial" license
Support for this script available here

ERROR: Your base mail folder could not be found. Please configure the $base_mail_folder variable within the script.. Execution cannot continue until this is fixed


any suggestions??
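
One way to narrow this down is to check whether the folder the script expects actually exists. The script's configuration comments say it looks for a 'mail' folder inside /home/$cpanel_username/ unless $base_mail_folder is set. Below is a minimal Python sketch of that check. It is not part of sa-trainer.cgi; the function name `locate_base_mail_folder` is made up for illustration, and "/home/username" is a placeholder.

```python
from pathlib import Path


def locate_base_mail_folder(cpanel_home, base_mail_folder="/mail"):
    """Return the mail folder as a Path if it exists as a directory, else None.

    cpanel_home      -- e.g. "/home/username" (placeholder)
    base_mail_folder -- same convention as the script: leading '/', no trailing '/'
    """
    folder = Path(cpanel_home) / base_mail_folder.strip("/")
    return folder if folder.is_dir() else None
```

If `locate_base_mail_folder("/home/username")` returns None, the default folder is missing and $base_mail_folder probably needs to point somewhere else; the hosting provider should be able to say where.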

Offline jtaylor379

  • Newbie
  • *
  • Posts: 1
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #256 on: April 20, 2009, 05:37:36 PM »
Hmm, I don't have a cgi-bin folder. Under public_html I do have a _vti_bin folder. Little help here?

Cheers,
Jessica

Offline jimlongo

  • Intergalactic Superstar
  • *****
  • Posts: 153
    • Rhythm Division
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #257 on: April 30, 2009, 11:45:45 AM »
Hi, 2 questions.

1. I've enabled SpamAssassin and enabled the Spam Box. I thought this action would create the Spam box, but I don't see it anywhere in my server's mail folder.

2. The user_prefs file that gets downloaded with v3.04 is different from the user_prefs included in your instructions. Which one should I use?

Thanks,
jim

Offline angelad

  • Trekkie
  • **
  • Posts: 19
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #258 on: June 22, 2009, 10:20:54 AM »
Quote from: quilthug
Ok so I think I followed the directions step by step. I'm not an expert, but I've been around the block a few times.

I put in my address:
http://www.mydomain.com/cgi-bin/sa-trainer.cgi

and I get the following:
sa-trainer.cgi version 3.04 by Ian Douglas, iandouglas.com, Copyright 2004-2007
Some Rights Reserved under a Creative Commons "Attribution Non-commercial" license
Support for this script available here

ERROR: Your base mail folder could not be found. Please configure the $base_mail_folder variable within the script.. Execution cannot continue until this is fixed


any suggestions??

Mail folder is completely missing in your case?

Offline rana9903

  • Newbie
  • *
  • Posts: 1
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #259 on: June 22, 2009, 07:43:37 PM »
Hello,
I am getting the following message and I am not sure if the script is working:
-----------------------------------------------------------------------------
sa-trainer.cgi version 3.04 by Ian Douglas, iandouglas.com, Copyright 2004-2007
Some Rights Reserved under a Creative Commons "Attribution Non-commercial" license
Support for this script available here

Training SpamAssassin for dunhillbd.com:
Checking /home/username/mail/mydomain.com/user1/.spam/cur/ to learn SPAM:

-----------------------------------------------------------------------------
It does not show anything else.

The status bar at the bottom of Firefox shows: Done.
Please help me get this working.


My user_prefs:
-----------------------------------------
user_prefs
File Type: ASCII text
------------------------------------------
use_bayes 1
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-Information
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_path /home/masterUser/.spamassassin/bayes
required_score 4.0
rewrite_header subject {SPAM _SCORE(0)_}
-------------------------------------------------------

I also deleted these files (as per the previous thread):
 
   bayes_seen   10528 k   0600
   bayes_toks   4704 k   0600

Thanks a lot
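
If the output stops right after the "Checking ... to learn SPAM:" line, one thing worth ruling out is an empty or missing spam folder. Here is a rough Python sketch (not part of the script; `count_maildir_messages` is a hypothetical helper) that counts message files in the cur and new subfolders of a Maildir folder such as the .spam folder in the path above:

```python
from pathlib import Path


def count_maildir_messages(folder):
    """Count message files in the 'cur' and 'new' subfolders of a Maildir folder.

    Returns a dict like {"cur": 12, "new": 0}; a value of None means that
    subfolder does not exist at all.
    """
    folder = Path(folder)
    counts = {}
    for sub in ("cur", "new"):
        subdir = folder / sub
        if subdir.is_dir():
            counts[sub] = sum(1 for entry in subdir.iterdir() if entry.is_file())
        else:
            counts[sub] = None
    return counts
```

A count of 0 or None for cur would suggest there is simply nothing in that folder for the script to learn from, though it does not rule out other causes.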

Offline eluke

  • Newbie
  • *
  • Posts: 2
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #260 on: July 24, 2009, 08:12:44 AM »
I am having problems with my script.  Here are my two problems.  I created the script using the builder.  I followed all of the directions, including changing the permissions to 755 on the script.

Issues

1.       The script runs, but the bayes_toks and bayes_seen files are not created.
2.       When I run the script, I get "Learned tokens from 0 messages."

bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-Information
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_path /home/tsadmin/.spamassassin/bayes
required_score 3.5
rewrite_subject 1
subject_tag {SPAM _SCORE(0)-}
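
For the first issue, it can help to know exactly where the files should appear. SpamAssassin treats bayes_path as a filename prefix, so with the setting above the databases would be /home/tsadmin/.spamassassin/bayes_toks and bayes_seen. Here is a small Python sketch, assuming the user_prefs text is available as a string; the helper names `expected_bayes_files` and `missing_bayes_files` are invented for this example.

```python
from pathlib import Path


def expected_bayes_files(user_prefs_text):
    """Return the bayes_toks / bayes_seen paths implied by the first bayes_path line.

    Returns an empty list if no bayes_path line is present.
    """
    for line in user_prefs_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "bayes_path":
            prefix = parts[1]
            return [Path(prefix + "_toks"), Path(prefix + "_seen")]
    return []


def missing_bayes_files(user_prefs_text):
    """List the expected database files that do not exist yet."""
    return [path for path in expected_bayes_files(user_prefs_text) if not path.exists()]
```

If the .spamassassin directory itself is missing, or is not writable by the account the CGI runs as, sa-learn has nowhere to create the files. That is a guess at the cause, not a diagnosis.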

Offline BadCam

  • Pong! (the videogame) Master
  • *****
  • Posts: 23
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #261 on: January 16, 2010, 02:04:18 PM »
I get the following error:

Software error:
Could not connect to iandouglas.com to check for a new version of this software.<br />

I hadn't run the script in a while.  It used to work, but now I get this error.  How do I fix this?

Thanks

Now I have this same error. I tried the fix here:

Thanks.  Just so everyone is clear line 505 should be

print '<p><a href="/cgi-bin/'.$0.'">re-scan mailboxes</a><br />' ;

Notice the added " prior to the >

Thanks

and now I'm getting the following errors:

Quote
syntax error at train.cgi line 511, near "if ($mailformat eq 'Maildir' && ( -e "$spambox"
"use" not allowed in expression at train.cgi line 526, at end of line
"use" not allowed in expression at train.cgi line 529, at end of line
Can't find string terminator '"' anywhere before EOF at train.cgi line 531.

Just for some further info, this is the top part of my train.cgi file:

Code:
#!/usr/bin/perl
use CGI::Carp qw(fatalsToBrowser) ;
print "Content-type: text/html\n\n" ;

#####[[[
# sa-trainer.cgi
$version = "3.04" ;
#
# sa-trainer.cgi by Ian Douglas, iandouglas.com, Copyright 2004-2007
# Some Rights Reserved under a Creative Commons "Attribution Non-commercial"
# license, http://creativecommons.org/licenses/by-nc/3.0/
# (you are free to use, copy and modify this code and redistribute it, but
# please do give credit where it's due (to me), and your redistribution must
# NOT be for commercial purposes -- you got it from me for free, do the same
# for others please)
#
# To reach me for support, please contact me via Email at either of the
# following addresses: ian.douglas@iandouglas.com or wild98@gmail.com
#
# This script has always been, and will continue to be, free of charge to
# obtain. If you'd like to show appreciation for the work that's gone into it,
# you're more than welcome to send in a PayPal donation of any amount, however
# you are under NO OBLIGATION whatsoever to donate for my time. ;o)
#]]]
#####[[[ CONFIGURATION
#
# Everything under here should be pretty self-explanatory; you're always
# welcome to visit iandouglas.com if you have any questions.
#
# NOTE TO ADVANCED USERS: if you are specifying mailbox names in the
# configuration and you KNOW your mail storage is Maildir, do NOT include
# the '.' prefix on the mailbox name, the script will insert it for you
# where necessary

#####
# setting this to 'Y' will trigger a callback check to iandouglas.com to make
# sure you are running the latest copy of the script; this is totally optional
# and no personal data is sent from your system -- it merely retrieves the
# latest version number and compares it to this version of the script, and
# notifies you if my copy at iandouglas.com is newer.
# Commenting out this line, deleting it completely, or setting it to anything
# other than a capital 'Y' value will turn off the callback feature.
# THIS IS SAFE TO LEAVE SET TO "Y" UNLESS iandouglas.com IS OFFLINE
$callback_to_iandouglasdotcom = "Y" ;

#####
# your domain name; this is used to find the correct path for your Email
# folders; do not include "www." ALL USERS **MUST** SET THIS VARIABLE.
$my_domain = "XXXX.co.nz" ;

#####
# your CPanel login name; this is also used to find the correct path for
# finding your Email folders, and must be entered exactly the same way as you
# would use it in your FTP program. ALL USERS **MUST** SET THIS VARIABLE.
$cpanel_username = "XXXXXX" ;

#####
# if you know absolutely which mail format your Email is stored in (Mbox or
# Maildir), please uncomment the appropriate line below. If you don't know,
# or your server is subject to change at some point, leave both lines commented
# and the script will attempt to autodetect it for you. MOST USERS WILL NOT
# NEED TO ENABLE EITHER OF THESE LINES.
#$mail_format = "Mbox" ;
#$mail_format = "Maildir" ;

#####
# if your users will forward ham (non-spam) messages to a new Email address to
# scan, please uncomment the following line and enter the username portion of
# the Email address they will forward to; for example, if the mailbox is
# globalham@mydomain.com, just enter "globalham" and nothing more.
# note: this MUST be a mailbox within the domain name you configured above
# as $my_domain. MOST USERS WILL NOT NEED TO ENABLE THIS VARIABLE
$global_ham_email = "globalham" ; # @ mydomain.com

#####
# if you manually collect all ham messages into a global ham folder yourself,
# you should comment out the $global_ham_email variable AND the Inbox_for_ham
# variable listed below, and set this variable to the name of the folder under
# your $cpanel_username mail account where all global ham messages will be kept
# for scanning. MOST USERS WILL NOT NEED TO ENABLE THIS VARIABLE
#$global_hambox = "scan-ham" ;

#####
# if you want to scan your users' Inbox folders instead of a separate 'ham'
# folder, set the following line to "Y".
# If you are using the global Email address or $global_hambox variables
# listed above, then THIS variable MUST remain set to "N" -- you cannot scan
# both your user's Inboxes *and* a global Email account/folder for ham.
# Enabling this variable and setting it to 'N' will search for a folder called
# 'ham' within each user account. MOST USERS WILL SET THIS TO "N"
$check_user_Inbox_for_ham = "N" ;
# if the above variable is set to "N", you can enter a mailbox name here to
# scan for non-spam messages; we recommend users create a folder called "ham"
# but you can set that here to some other name
#$user_hambox = "ham" ;

#####
# scan your individual users' spam boxes; MOST USERS WILL SET THIS TO 'Y'.
# If you collect all of the spam messages into a global spam folder for all
# users as part of your $cpanel_username mail account INSTEAD, then comment
# out this line and set the global spambox setting below.
$check_user_spamboxes_for_spam = "Y" ;
# if you want your users to move their own spam to a new mailbox for scanning
# (useful so users who neglect to move spam don't bog down your script with
# thousands of old spam messages accumulating over time), you can enter a new
# mailbox name here for spam; all users will need to create this folder name,
# and only this folder name will be used for scanning ALL individual spam boxes
$user_spambox = "spam" ;

#####
# global spambox setting; generally this will not be used if your users have
# their own spam folders. If the $check_user_spamboxes_for_spam variable above
# is set to 'Y', this line should be commented out; if you do NOT want to use
# your users' individual spam folders, then set the name of the folder under
# your $cpanel_username mail account where spam will be stored instead. Most
# users will not need to configure this, and will just use the variable above
# ($check_user_spamboxes_for_spam = "Y") instead. MOST USERS WILL NOT NEED TO
# ENABLE THIS VARIABLE
#$global_spambox = "scan-spam" ;

#####
# if you have multiple add-on domain names that you would like to check with
# one execution of this script, uncomment the following line, and enter each
# domain name within quotes inside the parentheses. Note that ALL other
# rules listed above will apply identically to all domains, and that the
# $global_hambox and $global_spambox will apply only to your $my_domain
# domain name only; to explicitly *exclude* an add-on domain, you will need to
# add all other add-on domains. MOST USERS WILL NOT NEED TO ENABLE THIS
# VARIABLE.
#@addon_domain_list = ( 'addon-domain-1.com' , 'addon-domain2.com' ) ;

#####
# THE FOLLOWING FEATURE IS NOT YET ENABLED
# Thank you, Paul D., for your idea to add this as a feature!
# if you have an exclusive list of users you want to scan, or an exclusion list
# of usernames who you do NOT want to scan SPAM for, you can enter their full
# Email addresses here. The script will explicitly watch for these users only
# if they are listed without the exclusion marker -- to exclude a user, prefix
# their entry with an exclamation point such as "!excludeme@mydomain.com".
# MOST USERS WILL NOT NEED TO ENABLE THIS.
#####
# NOTE: enabling this list will scan ONLY these accounts and no others, so is
# best used only to enable a few select accounts for spam/ham or to only
# exclude certain users (and scan ALL others).
#####
#@users_to_scan = ( 'john.doe@mydomain.com' , '!excludeme@mydomain.com' ) ;

#####
# if you see an error message about not being able to detect the SpamAssassin
# training application on the server, uncomment the following line and set it
# to the path to where the "sa-learn" application is on your server (you will
# need to ask your hosting provider for this information). This is not usually
# needed. MOST USERS WILL NOT NEED TO ENABLE THIS VARIABLE
#$path_to_salearn = "/usr/bin/sa-learn" ;

#####
# if your CPanel hosting environment stores your Email in a folder other than
# 'mail' within /home/$cpanel_username/ then you should enable this variable
# and specify only the name/path of your root mail folder relative to your
# /home/$cpanel_username/ directory. Be sure to prefix it with a '/', but do
# not add a trailing slash. MOST USERS WILL NOT NEED TO ENABLE THIS VARIABLE.
#$base_mail_folder = "/mail" ; # no trailing '/' please

#####]]] CONFIGURATION IS COMPLETE!
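
The comments in the configuration block describe a few combinations that cannot be used together, and it can be worth checking them before hunting for other errors. The Python sketch below is only an illustration of those rules as read from the comments; the script itself does not necessarily enforce them. `find_config_conflicts` and its dictionary keys are hypothetical names that mirror the Perl variables, and a variable that is commented out in the script is represented as None or left out.

```python
def find_config_conflicts(cfg):
    """Return human-readable conflicts between sa-trainer style settings.

    cfg maps variable names (without the Perl '$') to their values;
    a commented-out variable is None or simply absent.
    """
    conflicts = []
    uses_global_ham = bool(cfg.get("global_ham_email") or cfg.get("global_hambox"))

    # From the comments: cannot scan user Inboxes *and* a global ham source.
    if uses_global_ham and cfg.get("check_user_Inbox_for_ham") == "Y":
        conflicts.append("check_user_Inbox_for_ham must be 'N' when a global ham address/folder is set")

    # From the comments: $global_hambox is used instead of $global_ham_email.
    if cfg.get("global_ham_email") and cfg.get("global_hambox"):
        conflicts.append("comment out global_ham_email when global_hambox is used")

    # From the comments: per-user spam boxes and a global spambox are alternatives.
    if cfg.get("check_user_spamboxes_for_spam") == "Y" and cfg.get("global_spambox"):
        conflicts.append("comment out global_spambox when check_user_spamboxes_for_spam is 'Y'")

    return conflicts


# The active (uncommented) settings from the configuration posted above.
posted = {
    "global_ham_email": "globalham",
    "check_user_Inbox_for_ham": "N",
    "check_user_spamboxes_for_spam": "Y",
    "user_spambox": "spam",
}
print(find_config_conflicts(posted))  # prints []
```

By these rules the posted settings are consistent with each other, which would point the syntax errors at the edited lines further down the file rather than at the configuration block.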

Should I upgrade to V3.5 using the "Build Your Own" SpamAssassin Trainer here?

http://iandouglas.com/spamassassin-trainer/

Or, should I just fix the errors in the current 3.04 version?

Also, my user_prefs file is as follows:

Code:
use_bayes 1
bayes_file_mode 0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-Information
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_path /home/XXXXXX/.spamassassin/bayes
required_score 4.0
rewrite_header subject {SPAM _SCORE(0)_}
score FH_DATE_PAST_20XX 0

But I see that on the first page of this thread it should perhaps now be:

Code:
use_bayes   1
required_hits   3.5
rewrite_subject   1
subject_tag   {SPAM _SCORE(0)_}
bayes_path   /home/xt88002/.spamassassin/bayes
bayes_file_mode   0600
bayes_ignore_header X-MailScanner
bayes_ignore_header X-MailScanner-SpamCheck
bayes_ignore_header X-MailScanner-SpamScore
bayes_ignore_header X-MailScanner-Information
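
On the question of which form is right: going by the SpamAssassin 3.x upgrade notes, rewrite_subject and subject_tag are 2.x option names that 3.x replaced with rewrite_header, and required_hits is the older name for required_score. On that reading, the first listing (with rewrite_header and required_score) already uses the newer names. Below is a quick Python sketch for spotting the older names in a user_prefs file. The mapping is an assumption to verify against the documentation for the installed version, and `find_legacy_options` is a made-up helper name.

```python
# Older option name -> newer equivalent (assumed from the 3.x upgrade notes).
LEGACY_OPTIONS = {
    "rewrite_subject": "rewrite_header Subject <tag>",
    "subject_tag": "rewrite_header Subject <tag>",
    "required_hits": "required_score",
}


def find_legacy_options(user_prefs_text):
    """Return (line_number, option, suggested_replacement) for each legacy option."""
    findings = []
    for lineno, line in enumerate(user_prefs_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        option = stripped.split(None, 1)[0]
        if option in LEGACY_OPTIONS:
            findings.append((lineno, option, LEGACY_OPTIONS[option]))
    return findings
```

Run against the second listing, this would flag required_hits, rewrite_subject and subject_tag; run against the first listing, it would flag nothing.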

Please note: I have added to my existing user_prefs this line because I understand I need to do this because of the current SA issue:

Code:
score FH_DATE_PAST_20XX 0
Correct?
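
For context on why that line helps: the FH_DATE_PAST_20XX rule was meant to catch mail dated far in the future, but as widely reported in January 2010, its pattern also matched any Date header from 2010 onward, so legitimate mail picked up extra points. Setting the score to 0 switches the rule off until updated rules are installed. The Python sketch below illustrates the effect. The pattern is the one commonly quoted in reports of the bug and is an assumption here, not copied from an installed rules file; `date_header_hits_rule` is a hypothetical helper.

```python
import re

# Assumed pre-fix pattern, as commonly quoted: matches the years 2010-2099.
FH_DATE_PAST_20XX = re.compile(r"20[1-9][0-9]")


def date_header_hits_rule(date_header):
    """Return True if the Date header would have matched the assumed old pattern."""
    return FH_DATE_PAST_20XX.search(date_header) is not None
```

Under that assumption, a header such as "Sat, 16 Jan 2010 14:04:18 +1300" matches while "Mon, 22 Jun 2009 19:43:37 -0500" does not, which fits the timing of the problem.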

Anyway, I'm just guessing at all of this, so any help would be greatly appreciated. Thanks very much in advance.
Why is reality always so real?

Offline kdorsey

  • Trekkie
  • **
  • Posts: 16
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #262 on: January 17, 2010, 03:00:00 PM »
I'm going to have to go through this thread and read up.  Been ignoring this for a while, but my spam filters need adjusting, that's for sure.

Offline Nawtyflier

  • Newbie
  • *
  • Posts: 1
Re: How-to: Train SpamAssassin - Updated May 30 2007
« Reply #263 on: February 18, 2010, 06:13:29 AM »
Hi,
I don't have a hosting plan, just email with Lunarpages.  I'm guessing there's no way to properly configure SpamAssassin since I have nowhere to upload the script.  Am I correct?

Offline w98

  • Galactic Royalty
  • *****
  • Posts: 443
    • http://iandouglas.com
Re: How-to: Train SpamAssassin - Updated April 27, 2010
« Reply #264 on: April 27, 2010, 10:12:35 AM »
Hey all, quick update.

I've fixed the callback to iandouglas.com to check for a new version. I've also made a change to the first article in this thread to point users to http://iandouglas.com/sa-trainer/ for a do-it-yourself SpamAssassin Trainer Builder app that I wrote. You can essentially just answer a few questions there, and it'll build everything you need.

I've fixed some bugs in the script, and have released v3.06.


lydian

  • Guest
Re: How-to: Train SpamAssassin - Updated April 27, 2010
« Reply #265 on: June 17, 2010, 04:04:49 PM »
Hmmm - giving up at this point. Everyone says just to give up on SpamAssassin, but I thought I'd give it a last try after being sent a link to this page.
- The promised link to full documentation is nowhere to be found. Does it exist?
- Even in version 3.06 of the script there are many typos and inconsistencies (global-ham or globalham?).
- Enabling the spam box apparently does not create the spam folders, which are referred to in at least 3 different spellings (SPAM, Spam and spam).

So - after hours of trying and correcting all we get is "cannot scan SPAM". Maybe SpamAssassin should just be laid to rest. Rumors persist that it can be useful, but then it is really tough to find anyone that can actually verify the latter. This software is a complete joke when compared with solutions like postini or anything a client side filter does without reams of (mostly incomplete) documentation. It is time that services like LP address the spam issue which gets worse every year in a serious manner and don't just point to an unintuitive, obsolete tool like SA and tell their customers to take a couple of weeks time to learn how to configure it. 

Offline jojooboo

  • Jabba the Hutt
  • *****
  • Posts: 717
    • http://www.fflschedules.com/
Re: How-to: Train SpamAssassin - Updated April 27, 2010
« Reply #266 on: June 21, 2010, 11:32:18 AM »
I'm loving this script!!  I do think there is one error, though.  On the sa-trainer.cgi results page it seems like the "Number of HAM messages scanned over time:" and "Number of SPAM messages scanned over time:" statistics are reversed.  It seems to work as intended, however, with ever-more spam getting filtered into the Spambox.

Offline jojooboo

  • Jabba the Hutt
  • *****
  • Posts: 717
    • http://www.fflschedules.com/
Re: How-to: Train SpamAssassin - Updated April 27, 2010
« Reply #267 on: June 21, 2010, 06:47:27 PM »
Quote from: lydian
This software is a complete joke when compared with solutions like postini or anything a client side filter does without reams of (mostly incomplete) documentation. It is time that services like LP address the spam issue which gets worse every year in a serious manner and don't just point to an unintuitive, obsolete tool like SA and tell their customers to take a couple of weeks time to learn how to configure it.

I could not disagree more.  SpamAssassin has been and continues to be a great tool, and with the ability to train it using the script on which this thread is based, it gets better and better.  Without SpamAssassin I would get 200+ spam emails a day.  As it stands I get fewer than 5.  I would not want to deal with that much stuff client-side, nor would I want LP to inject their own ideas about how to handle spam emails into the process.

Offline spatters1000

  • Space Explorer
  • ***
  • Posts: 6
Re: How-to: Train SpamAssassin - Updated April 27, 2010
« Reply #268 on: July 17, 2010, 08:19:38 AM »
Hi -- Got SA running. Seems to be doing its thing okay.

Question on maintenance: I seem to remember that I have to empty some folder of messages that SA captures as spam. When SA determines that a message is spam, what does it do with it? I can't remember if it puts it in some folder (hence the above question about emptying it) or if it just deletes it.

On a related question: Is there a way in this forum to search for a particular phrase or word within a specific thread? For example, I tried to find a way to search for "maintenance" within this thread to see if this question had already been addressed, but can't find a way. I tried visually scanning the posts, but gave up on that idea with so many pages of posts.

Thanks!

Offline benjazz

  • Newbie
  • *
  • Posts: 1
Re: How-to: Train SpamAssassin - Updated April 27, 2010
« Reply #269 on: August 07, 2010, 03:24:10 PM »
In the cPanel user's IMAP profile, create two new folders called "scan-ham" and "scan-spam"

How and where exactly do I do this step? In cPanel, or in my email client software (Mac OS X Mail)?
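
IMAP folders live on the server, so in principle they can be created from any IMAP client logged in as that mailbox, including Mac OS X Mail or webmail. For anyone who prefers to script it, here is a rough Python sketch using the standard imaplib module. The host name, login, and the "INBOX." prefix (used by some mail servers, but not guaranteed on every host) are all assumptions, and `create_scan_folders` is a hypothetical helper.

```python
import imaplib


def create_scan_folders(conn, names=("scan-ham", "scan-spam"), prefix="INBOX."):
    """Create each folder via an already logged-in IMAP connection.

    Returns a dict mapping the full folder name to the server's status:
    "OK" on success, "NO" if the server refused (for example, the folder
    already exists).
    """
    results = {}
    for name in names:
        full_name = prefix + name
        status, _data = conn.create(full_name)
        results[full_name] = status
    return results


# Example usage (placeholders; nothing below runs unless uncommented):
# conn = imaplib.IMAP4_SSL("mail.mydomain.com")
# conn.login("cpanel_username", "password")
# print(create_scan_folders(conn))
# conn.logout()
```

If the server does not use an "INBOX." namespace, passing `prefix=""` would create the folders at the top level instead.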

 
