Web Hosting Forum | Lunarpages


Author Topic: how to limit web surfers?  (Read 3140 times)
doh!
« on: January 08, 2007, 09:55:58 AM »

I was wondering how to limit web surfers to a certain stay length or number of page views. Recently some visitors have been way out of the ordinary in number of clicks and bandwidth usage (over 30,000 page views in one visit from one IP). I've been looking on Google, and aside from blocking their IPs, I found that there's something called QoS that can limit visitor bandwidth and usage. But I don't know how to apply that in our LP account, since the examples I found on the internet require patching the kernel....

Does anyone here use such methods to limit possibly malicious web traffic? Perhaps giving them a warning page so they have to wait a certain amount of time before they can click again... or some other method?
GMTurner
« Reply #1 on: January 08, 2007, 10:07:20 AM »

It's possible that those visitors are search engine spiders crawling your site for their index... so I'm not sure whether you would want to block them or not...


But, thinking off the top of my head, one way to do it (a rough sketch follows the list below)...
- create a table in a DB that contains fields for IP address, time/date of last visit, and number of visits...
- create a PHP script to be included at the start of each page that:
  - gets the requesting IP address
  - if not in the table, adds it
  - if in the table, sees if a maximum number of visits in _x_ number of days has been reached
    - if it has been a couple of days since last visit, reset count to zero
    - if it is below the threshold, increase the count by 1 and allow access
    - if threshold has been reached, redirect to a "too many visits" page
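
Something along these lines in PHP -- just a rough sketch of the steps above, where the table name, column names, limits, and database credentials are all placeholders you'd swap for your own:

Code:
<?php
// Rough sketch of the per-IP visit counter described above.
// Assumes a table created roughly as:
//   CREATE TABLE visitor_hits (ip VARCHAR(45) PRIMARY KEY,
//                              last_visit DATETIME, visits INT);
// All names, limits, and credentials here are placeholders.

$maxVisits  = 500;   // page views allowed per window
$windowDays = 2;     // reset the count after this many quiet days

$db = new PDO('mysql:host=localhost;dbname=mydb', 'dbuser', 'dbpass');
$ip = $_SERVER['REMOTE_ADDR'];

$stmt = $db->prepare('SELECT visits, last_visit FROM visitor_hits WHERE ip = ?');
$stmt->execute(array($ip));
$row = $stmt->fetch(PDO::FETCH_ASSOC);

if (!$row) {
    // First time we've seen this IP: add it with a count of 1.
    $db->prepare('INSERT INTO visitor_hits (ip, last_visit, visits) VALUES (?, NOW(), 1)')
       ->execute(array($ip));
} elseif (strtotime($row['last_visit']) < time() - $windowDays * 86400) {
    // It has been a couple of days since the last visit: reset the count.
    $db->prepare('UPDATE visitor_hits SET visits = 1, last_visit = NOW() WHERE ip = ?')
       ->execute(array($ip));
} elseif ($row['visits'] < $maxVisits) {
    // Below the threshold: count this hit and let the page continue.
    $db->prepare('UPDATE visitor_hits SET visits = visits + 1, last_visit = NOW() WHERE ip = ?')
       ->execute(array($ip));
} else {
    // Threshold reached: send them to the "too many visits" page.
    header('Location: /too-many-visits.html');
    exit;
}
?>

You'd include() that at the top of each page, before any output is sent (header() won't work after output has started).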

This isn't an ideal solution and I'm sure it has more than a few problems, but it's something that came to mind right away as a starting point for approaching the problem, rather than having to ban and unban the IP every few days.

You might also be able to set up a script that parses the raw log file and automatically bans/unbans IPs, then schedule that script to run via cron once a day or so... you could even combine the two ideas. And the "too many visits" page could be an explanation of why they can't get there, without needing to say they are "banned" or "forbidden"... just that although you appreciate their interest in your site, they have visited xx times in the given time period and you want to make sure that others have a chance to visit the site as well.
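
For the log-parsing idea, a very rough sketch of a cron job might look like the following -- the log path, threshold, and the way the bans get written out are all assumptions you'd adjust for your own account (and you'd want to whitelist known spiders before trusting it):

Code:
<?php
// Rough cron sketch: count hits per IP in the raw access log and write
// "deny from" lines for anything over a threshold. Run once a day via cron.
// The paths, threshold, and log format are assumptions for illustration.

$logFile   = '/home/username/access-logs/example.com';     // placeholder
$banFile   = '/home/username/public_html/.htaccess-bans';  // placeholder
$threshold = 5000;   // hits per day before an IP gets listed

$counts = array();
foreach (file($logFile) as $line) {
    // Common/combined log format puts the client IP first on the line.
    $ip = strtok($line, ' ');
    if ($ip !== false) {
        $counts[$ip] = isset($counts[$ip]) ? $counts[$ip] + 1 : 1;
    }
}

$denies = "order allow,deny\nallow from all\n";
foreach ($counts as $ip => $hits) {
    if ($hits > $threshold) {
        $denies .= "deny from $ip\n";
    }
}

// Rewriting the file fresh each run means yesterday's bans are lifted
// automatically. In practice you'd splice this block into the live
// .htaccess carefully so you don't clobber your other rules.
file_put_contents($banFile, $denies);
?>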

again, just random thoughts as they popped into my head...

doh!
« Reply #2 on: February 18, 2007, 01:53:08 PM »

I sometimes still get strange traffic on a website I'm managing. This month it has greatly exceeded the normal amount of bandwidth usage, and I can't figure out how someone can click on the site more than a few thousand times in just a few hours to generate so many hits (several tens of thousands of hits in fewer than 5 sessions). Normally a visitor will click through a few pages and run up a few hundred to a thousand or so hits.

I've been trying to block IPs, but I know there are some valid visitors from those places, so I still need a way to limit a user's click rate. I'd like to block an IP if it has clicked/made requests a certain number of times within a few seconds or has exceeded a per-IP bandwidth limit.

Is there a script or CGI program that will do this? While looking on Google I found fail2ban, but after looking into it, it seems to only ban failed logins. Is there such a program that will handle this kind of traffic? (It'd be great if it could also count the number of simultaneous visitors and redirect them to different web pages if there are too many users at one time.)

On the internet I found an article about banning IPs; it sounds like that isn't really the way to go. Is there some other method?

here's an excerpt from that article:
http://kalsey.com/2004/02/why_ip_banning_is_useless/
IP addresses are easy to fake as well. The design principles of TCP/IP allow the sender of a packet to specify its IP address. The message will still be routed to its destination using the fake origin address. Return packets would be mis-routed, however, because TCP/IP would send responses to the true location of the IP address rather than where it actually came from. This means that IP spoofing is ineffective in situations where you need to interact with a remote server, but very effective in a one-way conversation. I can't retrieve a Web page using a spoofed IP address because I need to make the request and then have the server send me the page. But I can send requests all day long if I don't care about the response.

Because the person can just send a bunch of data without caring about a reply, I'd like a way to stop or slow this process.

[edit] here's a link to a module that might have part of what I'm looking for but I can't be sure if it can be installed: http://dominia.org/djao/limitipconn.html
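
For reference, the example configuration on that page looks roughly like this (directive names are from the module's documentation; the limit values are only examples, and it only works if the module has actually been compiled into Apache at the server level, which is exactly what I can't be sure of on a shared account):

Code:
# Server-level Apache config for mod_limitipconn -- not something that can
# go in .htaccess; values are only examples.
ExtendedStatus On
<IfModule mod_limitipconn.c>
    <Location />
        # no more than 3 simultaneous connections per client IP
        MaxConnPerIP 3
        # don't count image requests against the limit
        NoIPLimit image/*
    </Location>
</IfModule>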
« Last Edit: February 18, 2007, 02:39:21 PM by doh! »
MrPhil
« Reply #3 on: February 18, 2007, 02:50:40 PM »

Do any of the site statistics tools tell you the IP address of these oversized visitors? Then who are they when you feed them to a whois service (e.g., http://whois.domaintools.com/)? As GMT suggested, it might not be good to ban search engine spiders from Google or others (unless you really don't want to be listed). A couple of ways to ameliorate their impact: 1) you can set up robots.txt and/or "robots" meta tags in your pages to shut them out of places that there's no point in their crawling, or 2) I think there's some way to tell search engines to recrawl your site less often than they have been doing. I don't recall what the tag is -- you'll have to search these forums for it.
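
For example, a robots.txt along these lines (the paths are just placeholders) keeps well-behaved spiders out of areas there's no point in crawling; the nonstandard Crawl-delay line is honored by some crawlers, though Google ignores it and uses the crawl rate setting in its webmaster tools instead:

Code:
# Example robots.txt -- goes in public_html/; the paths are placeholders
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Crawl-delay: 10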

doh!
« Reply #4 on: February 18, 2007, 05:24:50 PM »

Yes, I have checked those out. They're coming from a few specific IPs (perhaps the same people? When I block a few IPs, some new ones pop up a week or so later exhibiting the same pattern of mass hits and downloads). They are not coming from spiders/robots. I do have a robots.txt, but Googlebot seems to just skip over that too. If possible, I'd prefer to automatically ban/block them for a short time (say 10 minutes or so), then allow them to come back later, in case they're people with dynamic IPs.
MrPhil
« Reply #5 on: February 19, 2007, 08:47:18 PM »

Quote from: doh!
I do have a robots.txt, but googlebot seems to just skip over that too.

That's interesting. Google's spider (Googlebot) is supposed to be very well behaved about obeying robots.txt and meta directives for "robots" and/or "googlebot". Are you absolutely sure it's really Googlebot in question? I don't think their IP addresses jump around, so if yours do, it's possible that someone is forging the identity. The other possibility is that your robots.txt is formatted incorrectly or is in the wrong place (it should be in public_html/, at least for a primary domain -- I would guess in public_html/other/ for a subdomain or add-on domain).
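
One quick way to check whether a hit that claims to be Googlebot really comes from Google is a reverse DNS lookup on the IP: genuine Googlebot addresses resolve to hostnames ending in googlebot.com, and those hostnames resolve back to the same IP. A rough PHP sketch (the address below is only an example, not one from your logs):

Code:
<?php
// Reverse-then-forward DNS check for a suspected Googlebot IP.
// The IP here is just an example; feed it one from your raw logs.
$ip   = '66.249.66.1';
$host = gethostbyaddr($ip);   // e.g. crawl-66-249-66-1.googlebot.com

if (preg_match('/\.googlebot\.com$/', $host) && gethostbyname($host) === $ip) {
    echo "$ip looks like genuine Googlebot ($host)\n";
} else {
    echo "$ip is NOT verified as Googlebot ($host)\n";
}
?>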

Ed
« Reply #6 on: February 27, 2007, 02:12:47 PM »

If you are running any popular programs (e.g. WordPress) that have been exploited in the past, scripts will try to exploit your install. It's not uncommon for me, on the day I publish a blog post, to get 1,000 hits from a single script trying to exploit my (already secured) script. They still feel inclined to keep trying, even when it doesn't work.

That causes some annoying server load, and a lot of spam for me to sort through in my comments.

As was mentioned earlier, IP bans won't really protect you if the script is trying to exploit your site and really doesn't care what content you send back.
