Web Hosting Forum | Lunarpages
Topic: Raw logs questions
Dr_Test
« on: July 02, 2008, 04:02:09 AM »

I have some questions about raw logs:

1: Are Raw logs supposed to show *everything* that's going on in my site? Sometimes I think it's missing things... Example: Often, someone will load my main page, but instead of me getting a log entry showing that they're downloading index.htm, I'll get one showing that they're downloading header.htm (an inline frame on index.htm). Am I missing something here?

2: Also, if there is NO reference in my logs to someone downloading a certain file, does that mean the file was never downloaded, period? I'm wondering how the Yahoo bot used up 20 gigs last month on my tiny site, if it didn't download my huge RAR archives. (There's no mention of them ever being downloaded in the logs...)

3: Are the times reported in the logs relative to MY time (US Pacific), or Lunarpages' time?

4: Anyone know a resource that explains what ALL fields in an entry mean? An entry looks like this:
Quote
66.151.164.208 - - [30/Jun/2008:01:36:50 -0700] "GET /forum/index.php?action=who HTTP/1.1" 200 2596 "http://www.gunreal.com/forum/index.php" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14"
I highlighted in bold the parts I'm wondering about...

Thanks!
« Last Edit: July 03, 2008, 12:20:32 AM by Dr_Test »
Vitalian
« Reply #1 on: July 03, 2008, 08:41:43 AM »

1. Yes, the raw logs show everything that goes on on your site, as far as they've been configured to do. When using frames, the browser downloads each frame individually, so you'll have a call for both index.htm and header.htm.

2. I'm not quite sure. As far as I believe, it should mean that the file was never downloaded. And, I don't think Yahoo would have used up 20 gigs. I don't think that indexing your entire forums would have taken up that much bandwidth. And Yahoo wouldn't download your .rar files. It would just make note of them. (If it did, Yahoo would suffer major problems trying to index file-hosting sites!)

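3. The times in your log files are in the local time of wherever your server is located. According to the log snippet you posted, the server is running at UTC-7. In late June that fits either Pacific Daylight Time or Mountain Standard Time (most of Arizona stays on MST all year), so the offset alone can't tell you which.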

4. To quickly get you started on what each field is:
  • 66.151.164.208 - The IP address making the request.
  • The first hyphen is the client's identity as reported by identd. A hyphen means the information for that field was not available, which is normal. You won't find this information in most log files.
  • The second hyphen again stands for information that cannot be found. This hyphen is replaced by a userid when the request was made with HTTP authentication (mod_auth, for example).
  • [30/Jun/2008:01:36:50 -0700] - The datestamp. The request was made on June 30, 2008 at 1:36:50 AM. -0700 is the offset from UTC, meaning the server's clock is 7 hours behind UTC. Mountain Standard Time is always UTC-7, and Pacific Daylight Time (Pacific time in summer) is UTC-7 as well.
  • "GET /forum/index.php?action=who HTTP/1.1" is the request made by the browser. The browser is requesting to be sent /forum/index.php?action=who over HTTP version 1.1
  • 200 - the status code returned. 200 indicates a status of "OK" and is the normal result for any existing web page (besides 304 - "Not Modified").
  • 2596 is the size in bytes of the data sent back to the browser. In Apache's log format this counts the response body, not the HTTP headers.
  • "http://www.gunreal.com/forum/index.php" is the referrer for the request.
  • "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14" is the "User-Agent HTTP request header." It's like a server signature, but for the browser. So, we know the request was made from a Windows XP computer (Windows NT 5.1), that the browser uses the Gecko rendering engine, and that the browser is Mozilla compatible. The browser also happens to be Firefox 2.0.0.14. There's a lot of info that can be pulled out of this.

    You can find the entire reference here.
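
To make the field breakdown above a bit more concrete, here is a minimal Python sketch that splits an entry like the one you posted into named fields. The regular expression and the names `LOG_PATTERN` and `parse_line` are my own, made up for illustration, and the sketch assumes the log uses the usual Apache "combined" layout that your snippet appears to follow. A differently configured log would need a different pattern.

```python
import re
from datetime import datetime, timezone

# Assumed layout: Apache "combined" format, as in the entry quoted above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # %z understands offsets such as -0700, so the result is timezone-aware.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # A size of "-" means no body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

sample = (
    '66.151.164.208 - - [30/Jun/2008:01:36:50 -0700] '
    '"GET /forum/index.php?action=who HTTP/1.1" 200 2596 '
    '"http://www.gunreal.com/forum/index.php" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) '
    'Gecko/20080404 Firefox/2.0.0.14"'
)

entry = parse_line(sample)
print(entry["ip"], entry["status"], entry["size"])
# The same instant expressed in UTC, which sidesteps the timezone question.
print(entry["time"].astimezone(timezone.utc).isoformat())
```

Converting the timestamp to UTC (or to your own zone) this way means you don't have to guess which timezone the server is in; the offset in the entry is all that's needed.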


Good thing for Restore Session in Firefox or I would have lost this entire post after typing the whole thing.
Dr_Test
« Reply #2 on: July 03, 2008, 08:22:05 PM »

Quote
1. Yes, the raw logs show everything that goes on on your site, as far as they've been configured to do. When using frames, the browser downloads each frame individually, so you'll have a call for both index.htm and header.htm.
See, that's what I don't get. There often is no log entry for index.htm, or the other way around (there's a log entry for index.htm, but not header.htm).

Quote
2. I'm not quite sure. As far as I believe, it should mean that the file was never downloaded. And, I don't think Yahoo would have used up 20 gigs. I don't think that indexing your entire forums would have taken up that much bandwidth. And Yahoo wouldn't download your .rar files. It would just make note of them. (If it did, Yahoo would suffer major problems trying to index file-hosting sites!)
I don't understand how it used 20 gigs then. (note: I've read before that Yahoo is a big bandwidth hog... I'm just trying to figure out HOW, as I didn't see it downloading anything big on my site --- the reason I wonder is to figure out if and how my raw logs are working. I don't actually care what the bots are doing: I just want everything to be recorded).

Quote
4. To quickly get you started on what each field is:
Thanks a lot for your help. :)
« Last Edit: July 03, 2008, 08:24:51 PM by Dr_Test »
Vitalian
« Reply #3 on: July 04, 2008, 03:02:29 AM »

Oh, Oh!! I just looked at my file and I may have found something. Requests for index.htm may sometimes be listed as "GET / HTTP/1.1". This means the browser was requesting the index of the site, but referenced it as nothing so that it got the default page. These entries should be directly followed by a call for header.htm. Anything that requests index.htm is probably someone who has made your site a favorite and is requesting the specific page. I don't know why that wouldn't be followed by a header.htm request though.
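
If you want to count those two forms of the request together, something along these lines could work. It is only a sketch: `DEFAULT_PAGE` and `normalize_path` are names I made up, and it assumes index.htm is the default document the server hands out for a directory request, which is what it looks like on your site.

```python
DEFAULT_PAGE = "index.htm"  # assumption: the server's default document

def normalize_path(request):
    """Map a request line such as 'GET / HTTP/1.1' to the file it serves."""
    parts = request.split()
    if len(parts) < 2:
        return None  # malformed request line
    # Drop any query string, then treat a trailing slash as the default page.
    path = parts[1].split("?", 1)[0]
    if path.endswith("/"):
        path += DEFAULT_PAGE
    return path

requests = [
    "GET / HTTP/1.1",
    "GET /index.htm HTTP/1.1",
    "GET /header.htm HTTP/1.1",
]
for request in requests:
    print(request, "->", normalize_path(request))
```

With the paths normalized like this, "GET /" and "GET /index.htm" end up under the same name, so it becomes easier to see whether a main page load really is missing from the log.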

As for why Yahoo is using up so much bandwidth, I have no idea. My server at my house is regularly hit by search engines, but I'm on vacation and have no access to the log files to see what is logged. Just make sure you're not looking at people coming from a Yahoo search who are using up the bandwidth. In that case, Yahoo would be listed as the referring address rather than as the user agent. Even if you're not worried, I would seriously look into why Yahoo is downloading 20 gigs. That's serious abuse of the server for just a search engine!
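
One way to tell the two cases apart is to total the bytes separately for requests whose user agent names Yahoo's crawler (it identifies itself as "Yahoo! Slurp") and for requests that merely arrived with a Yahoo referrer. The sketch below assumes the combined log layout from earlier in the thread. `TAIL`, `bandwidth_by_source` and the sample lines are made up for illustration and are not taken from anyone's real log.

```python
import re

# Only the fields needed here: size, referrer and user agent at the end of the line.
TAIL = re.compile(
    r'" \d{3} (?P<size>\d+|-) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def bandwidth_by_source(lines):
    """Total bytes for crawler traffic, Yahoo-referred visitors, and the rest."""
    totals = {"yahoo_bot": 0, "yahoo_referred": 0, "other": 0}
    for line in lines:
        match = TAIL.search(line.strip())
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        size = 0 if match["size"] == "-" else int(match["size"])
        if "slurp" in match["agent"].lower():
            totals["yahoo_bot"] += size
        elif "yahoo." in match["referrer"].lower():
            totals["yahoo_referred"] += size
        else:
            totals["other"] += size
    return totals

# Invented sample lines, for illustration only.
sample_lines = [
    '192.0.2.1 - - [30/Jun/2008:01:00:00 -0700] "GET /a.htm HTTP/1.0" 200 1000 '
    '"-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '192.0.2.2 - - [30/Jun/2008:01:00:05 -0700] "GET /b.htm HTTP/1.1" 200 500 '
    '"http://search.yahoo.com/search?p=x" "Mozilla/5.0 Firefox/2.0"',
    '192.0.2.3 - - [30/Jun/2008:01:00:09 -0700] "GET /c.htm HTTP/1.1" 304 - '
    '"-" "Mozilla/5.0 Firefox/2.0"',
]
print(bandwidth_by_source(sample_lines))
```

To run it on a real log you could pass an open file instead of the list, since the function only needs something it can loop over line by line. Keep in mind the size field most likely leaves out the HTTP headers, so the totals will probably come out somewhat lower than whatever the host's bandwidth meter reports.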

Always happy to help ^^