Nuke Cops :: View topic - Denial of Service Attack, or a Bug of Some Kind?
tkevans
Nuke Soldier


Joined: Jul 14, 2003
Posts: 29

Location: Baltimore, MD, USA

Posted: Wed Sep 22, 2004 6:09 am

Over the past couple of weeks, I have been noticing unusual activity on my version 7.4 site. News articles are suddenly getting far, far more reads than has been the prevailing usage. While at first you might think it's good to get more traffic, it's the patterns that smell funny.

Specifically, the numbers of reads for all articles on the main page go up by exactly the same increment. For example, each one goes up by 10 or 20 (usually an even number) once or twice a day. I know all my articles are not equally popular, and seeing each one of them getting precisely equal additional reads just seems out of line.

So far, the increased traffic is not so substantial as to constitute a denial of service attack, but it's strange nonetheless.
tkevans
Nuke Soldier


Joined: Jul 14, 2003
Posts: 29

Location: Baltimore, MD, USA

Posted: Mon Oct 04, 2004 5:19 am

Just to follow up my own posting here:

This is a rude search engine (news.allresearch.com) repeatedly spidering the site, following all links every few hours. This would account for the pretty much exactly equal numbers of reads of all news articles.
wandering_goliard
Nuke Cadet


Joined: Jul 19, 2003
Posts: 4


Posted: Sat Oct 16, 2004 7:38 pm

I'll back this up, as well...they started banging the crap out of my site on 17 September, every day, same thing. I finally sent them a strongly worded email tonight, as well as banning the domain .allresearch.com from my sites.
tkevans
Nuke Soldier


Joined: Jul 14, 2003
Posts: 29

Location: Baltimore, MD, USA

Posted: Sun Oct 17, 2004 7:49 am

They seem to start out with the backend.php RSS feed, which would be fine if that's all they did. In fact, however, they open each item in the RSS feed, then follow all the links on the page, which includes links to the home/news module, your topics links, your old news links, and all the rest.

So all the links on that page, on the home/news page, and on all the others get followed. Then they hit the next RSS entry, and do it all over again.

So, then, every few hours, this site scans pretty much everything on your site.
Noah977
Nuke Cadet


Joined: Oct 19, 2004
Posts: 2


Posted: Tue Oct 19, 2004 12:17 am

Hi,

I'm one of the authors of the system in question.

We're attempting to develop an RSS search engine. We DON'T spider any sites with this code.

The logic is simple.

1) Fetch an RSS feed.
2) Visit the links listed in the RSS feed.
3) Look for "timing clues" in the RSS feed that will tell us when to re-visit. The "clues" understood by our software include updateFrequency, updatePeriod, updateBase, ttl, skipDays, skipHours, the "ETag" HTTP header, and the "Last-Modified" HTTP header.

If there are no timing clues at all, then we will re-visit in one hour.
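The re-visit logic described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: it assumes an RSS 0.9x/2.0-style feed with a plain `<channel>` element, reads `ttl` (minutes, per RSS 2.0) and the Syndication module's `updatePeriod`/`updateFrequency`, and ignores `updateBase`, `skipDays`, `skipHours`, and the HTTP headers. The function name `revisit_interval_seconds` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS 1.0 Syndication module (sy:updatePeriod, sy:updateFrequency).
SY_NS = "http://purl.org/rss/1.0/modules/syndication/"
PERIOD_SECONDS = {
    "hourly": 3600,
    "daily": 86400,
    "weekly": 7 * 86400,
    "monthly": 30 * 86400,   # approximation
    "yearly": 365 * 86400,   # approximation
}
DEFAULT_INTERVAL = 3600  # one hour, the fallback described in the post


def revisit_interval_seconds(feed_xml: str) -> int:
    """Return how long to wait before refetching, based on feed timing clues."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        return DEFAULT_INTERVAL

    # RSS 2.0 <ttl> is expressed in minutes.
    ttl = channel.findtext("ttl")
    if ttl and ttl.strip().isdigit() and int(ttl) > 0:
        return int(ttl) * 60

    # updateFrequency means "this many updates per updatePeriod".
    period = channel.findtext(f"{{{SY_NS}}}updatePeriod")
    if period and period.strip().lower() in PERIOD_SECONDS:
        freq_text = channel.findtext(f"{{{SY_NS}}}updateFrequency") or "1"
        freq = int(freq_text) if freq_text.strip().isdigit() else 1
        return PERIOD_SECONDS[period.strip().lower()] // max(freq, 1)

    return DEFAULT_INTERVAL
```

A feed with none of these elements falls through to the one-hour default, which matches the hourly pattern reported earlier in the thread.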

Our goal is not to cause any kind of DoS or problem, but to be good netizens and follow published protocol. It is our understanding of the RSS protocol that a feed page will tell you when to re-visit for new content.

If there is a bug somewhere in our code, and our software is not correctly picking up a timing clue, then please let me know. We would be very appreciative and happy to fix it.
tkevans
Nuke Soldier


Joined: Jul 14, 2003
Posts: 29

Location: Baltimore, MD, USA

Posted: Tue Oct 19, 2004 4:14 am

As the original poster on this, I can assure you my site logfiles show repeated loads of not only the RSS feed, but every link on every page.

You need to familiarize yourself with how PHP-Nuke works, with its common blocks that appear on every page. You load every page in the RSS feed, which is fine, but you then follow every link on every page. This ends up with you loading every link on every page every time you scan the site--and that means repeated loads of the same pages.

I can show you multiple (maybe 10-15 of them) loads of the same pages within just a minute or two, pretty much hourly.

In any event, my ISP has blocked you altogether (not only for my site, but all others they host), since you're clearly using up excessive bandwidth.
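For anyone wanting to check their own logs for the pattern described in this post, here is a minimal sketch. It assumes the web server writes Apache common or combined log format; the function name `repeated_loads` and its default window and threshold are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Matches the start of an Apache common/combined log line (assumed format).
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*"'
)


def repeated_loads(lines, window=timedelta(minutes=2), threshold=5):
    """Map (ip, path) to the largest number of hits seen within `window`,
    keeping only pairs that reach `threshold`."""
    hits = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits[(m.group("ip"), m.group("path"))].append(ts)

    bursts = {}
    for key, times in hits.items():
        times.sort()
        best, start = 0, 0
        # Sliding window over the sorted timestamps.
        for end in range(len(times)):
            while times[end] - times[start] > window:
                start += 1
            best = max(best, end - start + 1)
        if best >= threshold:
            bursts[key] = best
    return bursts
```

Calling `repeated_loads(open("access_log"))` (the path is a placeholder) would list every client and page pair that was fetched five or more times inside any two-minute span.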
Noah977
Nuke Cadet


Joined: Oct 19, 2004
Posts: 2


Posted: Fri Oct 22, 2004 8:14 am

Hi,

I'm not trying to be argumentative, but what you are describing is simply impossible. I KNOW what the code does. It simply DOESN'T spider. We wrote it from scratch, so we are very aware of what it does and doesn't do. Spidering a site involves a loop that strips links from each page visited and appends them to a data structure. We don't do that.
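To make the distinction in this exchange concrete, here is a toy sketch over a hypothetical in-memory site where every page carries the same navigation block, as PHP-Nuke pages do. None of this is the crawler's real code; `SITE`, `FEED`, and the three functions are invented for illustration. The third function shows one way a program without a spidering loop could still load the same pages repeatedly.

```python
from collections import deque

# Hypothetical site: every page carries the same navigation block links.
NAV = ["/index.php", "/topics", "/archive"]
SITE = {
    "/article1": NAV,
    "/article2": NAV,
    "/index.php": NAV + ["/article1", "/article2"],
    "/topics": NAV,
    "/archive": NAV,
}
FEED = ["/article1", "/article2"]


def feed_only(feed):
    """Load only the pages the feed lists: no link extraction at all."""
    return list(feed)


def spider(feed, site):
    """Classic spider: a loop that harvests links into a queue, visiting each URL once."""
    seen, queue, loads = set(feed), deque(feed), []
    while queue:
        url = queue.popleft()
        loads.append(url)
        for link in site.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return loads


def one_hop_no_dedup(feed, site):
    """Load each feed item, then every link on it, without remembering earlier loads."""
    loads = []
    for url in feed:
        loads.append(url)
        loads.extend(site.get(url, []))
    return loads
```

On this toy site, `feed_only` loads 2 pages, `spider` loads each of the 5 pages once, and `one_hop_no_dedup` makes 8 loads because the shared navigation links are fetched again for every feed item.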

If you want to send me some log entries, I'll happily look at them.

Thanks,

-N
jfesler
Nuke Cadet


Joined: Oct 24, 2004
Posts: 2


Posted: Sun Oct 24, 2004 7:53 am

Alas, their crawler doesn't do If-Modified-Since or try to stick to using its own cache. I can confirm their crawler's rudeness. Additionally, they never request /robots.txt or follow the robots.txt standard.

Between what I've seen on the web of this outfit (non-caching *hourly* crawling of destination links, no robots.txt handling, they service commercial accounts only and provide nothing to the general public) and what I've seen in my logs, I've simply null-routed them.
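For reference, the two courtesies mentioned in this post (honoring robots.txt and sending conditional requests) can be sketched with the Python standard library alone. This is a rough outline under assumptions: the user agent string `ExampleFeedBot/0.1` and the helper names are hypothetical, and a real crawler would also need to store the `Last-Modified` and `ETag` values from each response.

```python
import urllib.error
import urllib.request
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleFeedBot/0.1"  # hypothetical name, for illustration only


def allowed_by_robots(robots_txt: str, url: str, agent: str = USER_AGENT) -> bool:
    """Check a URL against the text of an already-downloaded robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def conditional_request(url, last_modified=None, etag=None, agent=USER_AGENT):
    """Build a request that lets the server answer 304 Not Modified."""
    req = urllib.request.Request(url, headers={"User-Agent": agent})
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    if etag:
        req.add_header("If-None-Match", etag)
    return req


def fetch_if_changed(req):
    """Return (status, body); a 304 means the cached copy is still current."""
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return 304, None
        raise
```

A 304 response carries no body, so an unchanged page costs the site only a few hundred bytes of headers instead of a full page generation.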
oprime2001
Lieutenant


Joined: Jul 13, 2003
Posts: 165


Posted: Sun Oct 24, 2004 10:07 am

What user agent shows up in your logs -- just in case I want to add it to my list of banned bots?
jfesler
Nuke Cadet


Joined: Oct 24, 2004
Posts: 2


Posted: Sun Oct 24, 2004 10:26 am

Looks like it is behind NAT. The one with the rude behavior is: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" coming from 38.144.36.16.

The one with what looks like more reasonable behavior is: "Mozilla/4.0 (compatible; MSIE 5.12; Mac_PowerPC)"; the Mac appears to be doing a proper If-Modified-Since (I *am* seeing 304s, and the links to .txt files are not being followed by the Mac).
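Since the first user agent string above is the same one ordinary Internet Explorer 6 on Windows XP sends, banning by user agent would also lock out regular visitors, so filtering on the source address is the safer route. A minimal sketch of such a check, assuming the address reported above is the only one to block (`BLOCKED_NETWORKS` and `is_blocked` are names invented here):

```python
import ipaddress

# The address reported in this thread; extend the list with ranges as needed.
BLOCKED_NETWORKS = [ipaddress.ip_network("38.144.36.16/32")]


def is_blocked(remote_addr: str) -> bool:
    """Return True if the client address falls inside any blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in BLOCKED_NETWORKS)
```

Because the crawler appears to sit behind NAT, other machines sharing that address would be blocked as well.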