Web Hosting Forum | Lunarpages
Author Topic: How to Install and Use Your Own Webalizer  (Read 7569 times)
jdrew
Spacescooter Operator
*****
Offline Offline

Posts: 34


« Reply #15 on: July 19, 2007, 07:42:11 PM »

Hmmmm....  Very strange.  I have 'archive logs' checked and 'delete raw logs' unchecked.  I set that about a week ago, so it's been in effect for at least that long.  When I download the gz file from 'Raw Access Logs', it's a relatively small file.  When I run webalizer on it, I just get two days.  When I download the file from 'Raw Log Manager' (which I would think is the same log file), the file is considerably larger but still only shows two days' worth when I webalize it.

For example, I downloaded it two days ago from both spots and it showed only the 16th and 17th.  Downloaded it yesterday and it showed only the 17th and 18th.  Now, today it just shows the 18th and 19th.  Does it take time for that log manager to catch up or something?  Clearly it is not accumulating for me.  Is this something I should contact support about or is there a simple explanation?

Thanks for all of your help.
Logged
SteveW
Master Jedi
*****
Offline Offline

Posts: 1394


WWW
« Reply #16 on: July 20, 2007, 03:17:37 AM »

It doesn't require any time to get caught up to the new setting.  It should take effect immediately. 

It occasionally happens, though, that the stats data processing fails to run every day.  It would be plain bad luck if this is happening to you at a time when you're just getting started with this process, adding to the confusion.

In cpanel, go to either Analog or Webalizer (or both) and check the timestamp at the top of the page showing when the stats were last run.  It should show a recent date, either yesterday or today.  If it doesn't, your stats have stopped running.  There are some previous posts here in the forum from when it happened to other people, including myself.

If they've stopped, it's worth submitting a support ticket.  When LP gives the stats processing a kick, it might get them running again on a daily basis, or it might not. In my case, they started running again a few weeks later, as abruptly as they had stopped. 

When the stats are not being processed daily, you can still get the log data, except from a different place. 

Go to cpanel > Raw Access Logs (I think that's the name; it's right next to Raw Log Manager) and download that file instead.  That file is where your raw data accumulates before the stats run.  When the stats run, they process it and the data is transferred to the file that you get from Raw Log Manager.  If you do get the file from Raw Access Logs, don't try to find and delete it on the server, because the data hasn't been processed yet.

I found some of the previous threads. Search these forums for the terms "stats run raw log" in posts from SteveW.

Threads near the top of the list will have discussions that I think will be useful if your stats have stopped running. 
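To check which days a downloaded log actually covers before running Webalizer on it, a short script can count records per day. This is only a minimal sketch: it assumes the log uses the usual Apache-style timestamp in square brackets, and the function names and sample lines are made up for illustration.

```python
import gzip
import re
from collections import Counter

# Date part of an Apache-style timestamp, e.g. [19/Jul/2007:19:42:11 -0700]
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def hits_per_day(lines):
    """Count log records per calendar day, keyed by the date text in the log."""
    counts = Counter()
    for line in lines:
        match = DATE_RE.search(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[match.group(1)] += 1
    return counts


def hits_per_day_in_gz(path):
    """Same count, reading a gzip-compressed log such as a cPanel download."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return hits_per_day(handle)


# Hypothetical sample records, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [18/Jul/2007:23:59:59 -0700] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [19/Jul/2007:00:00:01 -0700] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.3 - - [19/Jul/2007:08:15:00 -0700] "GET /a.html HTTP/1.1" 200 100 "-" "Mozilla/5.0"',
]

for day, count in hits_per_day(SAMPLE).items():
    print(day, count)
```

For a real download, call hits_per_day_in_gz() with the path to the .gz file. If the file from Raw Log Manager lists only two dates, the archive is not accumulating, whatever its size.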
Logged





Mt. Shasta
photo gallery.


Don't forget Lunarpages 24/7/365 support documentation:
Flash Tutorials, Knowledge Base FAQ Articles, cPanel Manual, Glossary/Dictionary, Support Tickets,
and
Forum Search.

jdrew
Spacescooter Operator
*****
Offline Offline

Posts: 34


« Reply #17 on: July 20, 2007, 05:25:24 AM »

Thanks very much.  I'm still having lots of problems, but it's off-topic for this thread.  I'll try the searches you mentioned and/or open a new thread to get help.
Logged
dbrewster
Spaceship Captain
*****
Offline Offline

Posts: 109


« Reply #18 on: August 03, 2007, 07:04:22 AM »

Thanks for this topic, very helpful.

I've been having 2 problems with Webalizer, maybe LP staff or customers can help!

1) I am running Webalizer on my PC. When I run it on the raw uncompressed log file, I get many error messages: [new_snode] Warning: String exceeds storage size.
For July, out of 3,630,643 records, 107 were ignored and 3 were bad, so it's a very small percentage. Still, I wonder about this. Is it a PC limitation?

2) I can't seem to get webalizer.conf configured to screen out referrers that are from my own site, which makes it a hassle to analyze the results. Here's my configuration for hiding. I've played with it, but evidently I don't grasp the matching rules.

216.12.92.18 is our local network address; I don't want to include stats from me hitting my own website. mail.coreknowledge.org is our internal mail server.

around line 376:

Code:
# The value can have either a leading or trailing '*' wildcard
# character.  If no wildcard is found, a match can occur anywhere
# in the string. Given a string "www.yourmama.com", the values "your",
# "*mama.com" and "www.your*" will all match.

# Your own site should be hidden
HideSite coreknowledge.org*
HideSite 216.12.92.18
#HideSite localhost

# Your own site gives most referrals
HideReferrer *coreknowledge.org
HideReferrer Direct Request

and later, around line 476:
Code:
# The Ignore* keywords allow you to completely ignore log records based
# on hostname, URL, user agent, referrer or username.  I hesitated in
# adding these, since the Webalizer was designed to generate _accurate_
# statistics about a web server's performance.  By choosing to ignore
# records, the accuracy of reports becomes skewed, negating why I wrote
# this program in the first place.  However, due to popular demand, here
# they are.  Use the same as the Hide* keywords, where the value can have
# a leading or trailing wildcard '*'.  Use at your own risk ;)

#IgnoreSite bad.site.net
#IgnoreURL /test*
IgnoreReferrer *coreknowledge.org
IgnoreReferrer 216.12.92.18
#IgnoreAgent RealPlayer
#IgnoreUser     root

# The Include* keywords allow you to force the inclusion of log records
# based on hostname, URL, user agent, referrer or username.  They take
# precedence over the Ignore* keywords.  Note: Using Ignore/Include
# combinations to selectively process parts of a web site is _extremely
# inefficient_!!! Avoid doing so if possible (ie: grep the records to a
# separate file if you really want that kind of report).

# Example: Only show stats on Joe User's pages...
IgnoreURL 216.12.92.18
IgnoreURL mail.coreknowledge.org
#IncludeURL ~joeuser*

# Or based on an authenticated username
IgnoreUser     216.12.92.18
#IncludeUser    someuser


Thanks for any guidance to help me get this sorted out!

---Diana

Logged
SteveW
Master Jedi
*****
Offline Offline

Posts: 1394


WWW
« Reply #19 on: August 03, 2007, 02:04:45 PM »

1) String exceeds storage size

I think Webalizer can only handle a string length of 256 in any field. Some search engines provide query strings in the Referrer field that exceed 256 characters. If you check a few of those 107 records, I suspect you'll find that they have long referrer fields.  I don't recall whether Webalizer truncates the field length (which is pretty much ok) or ignores the record entirely and doesn't import it (which would be slightly less ok from a statistics standpoint).
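One way to check that guess is to scan the log for records whose fields are longer than 256 characters. The sketch below assumes the log is in Apache combined format. The 256 limit is just the figure I mentioned above, not something taken from Webalizer's source, and the regular expression and function name are my own illustration.

```python
import re

# Apache "combined" format: host ident user [time] "request" status size "referrer" "agent"
COMBINED_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

FIELD_LIMIT = 256  # assumption from the post above, not from Webalizer's source


def oversized_fields(line, limit=FIELD_LIMIT):
    """Return {field name: length} for fields longer than limit.

    Returns None when the line does not look like a combined-format record.
    """
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    return {
        name: len(value)
        for name, value in match.groupdict().items()
        if len(value) > limit
    }


# Hypothetical record with a very long search-engine query string as referrer.
long_referrer = "http://search.example/?q=" + "a" * 300
sample = (
    '10.0.0.1 - - [03/Aug/2007:07:04:22 -0600] "GET / HTTP/1.1" 200 512 '
    '"' + long_referrer + '" "Mozilla/5.0"'
)
print(oversized_fields(sample))
```

Running this over the July log and printing only the lines with a non-empty result should show whether the ignored records really are the ones with long referrer fields.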

2) Try this. I don't use Webalizer a lot anymore, but I think this is what worked for me. It's modified from my own .conf file:

Code:
# Your own site should be hidden
HideSite http://coreknowledge.org
HideSite http://www.coreknowledge.org

# Your own site gives most referrals
HideReferrer http://coreknowledge.org
HideReferrer http://www.coreknowledge.org

I wouldn't hide direct request.

Those are the only two locations in Webalizer.conf where I refer to my own site, so I'm unsure whether some of your other such entries are necessary, or whether those entries might be hiding data that you would actually want to see. 
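To see why a pattern might not match, it can help to write out the matching rule exactly as the comment quoted from webalizer.conf describes it: a leading '*' matches the end of the string, a trailing '*' matches the start, and no wildcard matches anywhere. The sketch below is my reading of that comment, not Webalizer's actual code, and the referrer URL in it is a made-up example.

```python
def pattern_matches(pattern, value):
    """Apply the wildcard rule as described in the quoted webalizer.conf comment."""
    if pattern.startswith("*"):
        # Leading wildcard: the value must end with the rest of the pattern.
        return value.endswith(pattern[1:])
    if pattern.endswith("*"):
        # Trailing wildcard: the value must start with the rest of the pattern.
        return value.startswith(pattern[:-1])
    # No wildcard: the pattern may occur anywhere in the value.
    return pattern in value


# The three examples from the config comment all match.
for pattern in ("your", "*mama.com", "www.your*"):
    print(pattern, pattern_matches(pattern, "www.yourmama.com"))

# Hypothetical referrer string as it might appear in a log record.
referrer = "http://www.coreknowledge.org/index.htm"
for pattern in ("*coreknowledge.org", "coreknowledge.org", "http://www.coreknowledge.org"):
    print(pattern, pattern_matches(pattern, referrer))
```

Under this reading, `*coreknowledge.org` does not match that referrer, because a full referrer URL rarely ends with the bare domain, while the patterns without a wildcard do match. Whether your Webalizer version applies the rule exactly this way is something to confirm against its own documentation.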

The place where I exclude my own visits looks like this:

Code:
GroupSite 111.222.* myISP.tld-111.222.*
HideSite 111.222.*
« Last Edit: August 03, 2007, 02:07:29 PM by SteveW » Logged






Emptyeye
Trekkie
**
Offline Offline

Posts: 15


WWW
« Reply #20 on: October 04, 2007, 04:05:25 PM »

Quick question:

I see LP now provides Webalizer within cpanel. Is there a way to access the .conf file for this particular version? It doesn't seem to be in any obvious place, and the "Statistics Software Configuration" in cpanel just leads to some broken image links called "stats disabled"
Logged

SteveW
Master Jedi
*****
Offline Offline

Posts: 1394


WWW
« Reply #21 on: October 04, 2007, 07:07:28 PM »

On a shared server, there is no way to customize Webalizer (the .conf file) or the other stats program, Analog. Just ignore the "Statistics Software Configuration", as it is not user-configurable.

You can install Webalizer on your own PC and configure it however you want for analyzing your site access logs.

Look around at other stats programs, too. The one I like is Google Analytics.  After the initial setup, which looks intimidating but actually took me less time than it took to configure Webalizer, it provides very detailed and interesting reports.

Your raw access logs, as text files or after import into a database, allow searches and queries that no stats program can do.
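As a rough illustration of the database approach, the sketch below loads combined-format log lines into SQLite and lists the most frequent referrers that are not from your own site. It assumes Apache combined format. The table layout, function names and sample records are invented for the example, so adapt them to your own logs.

```python
import re
import sqlite3

# Apache "combined" format: host ident user [time] "request" status size "referrer" "agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def load_log(lines, db_path=":memory:"):
    """Parse log lines and insert them into a 'hits' table; returns the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "host TEXT, time TEXT, request TEXT, status INTEGER, "
        "size TEXT, referrer TEXT, agent TEXT)"
    )
    rows = []
    for line in lines:
        m = LOG_RE.match(line)
        if m:  # unparseable lines are skipped
            rows.append((m["host"], m["time"], m["request"], int(m["status"]),
                         m["size"], m["referrer"], m["agent"]))
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def top_external_referrers(conn, own_domain, limit=10):
    """Most frequent referrers that are neither empty ('-') nor from own_domain."""
    query = (
        "SELECT referrer, COUNT(*) AS n FROM hits "
        "WHERE referrer != '-' AND instr(referrer, ?) = 0 "
        "GROUP BY referrer ORDER BY n DESC, referrer LIMIT ?"
    )
    return conn.execute(query, (own_domain, limit)).fetchall()


# Hypothetical sample records, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [04/Oct/2007:16:05:25 -0700] "GET / HTTP/1.1" 200 512 "http://www.example.org/links.html" "Mozilla/5.0"',
    '10.0.0.2 - - [04/Oct/2007:16:06:00 -0700] "GET /a.html HTTP/1.1" 200 100 "http://www.mysite.example/" "Mozilla/5.0"',
    '10.0.0.3 - - [04/Oct/2007:16:07:00 -0700] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

connection = load_log(SAMPLE)
print(top_external_referrers(connection, "mysite.example"))
```

Once the records are in a table, any other question (hits from one IP range, requests for one file, referrers on one day) is just another SELECT.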
Logged




