Channel: Udo's Spout Nozzle » NET

Site Under Attack From Rogue MSN Bot? Well, Tough Luck!


Funny thing happened to my blog recently: not only was it being hacked, there was also a DoS attack going on. The attack originated from 65.55.107.111, which made me revise my initial impression that the two events were somehow connected. See, this IP is owned by Microsoft, and the USER_AGENT string identifies the server (which made well over 1.5 million HTTP requests in a short time frame) as an MSN search bot.

I’m hosted at MediaTemple, using the Grid Service hosting plan. That means an attack of this sort is unlikely to disable the server, since there is a whole grid that can absorb the load. However, it also means that I have to pay not only for the bandwidth used but also for cluster resources such as CPU time. So what’s a site owner supposed to do in this case? Now that the episode seems to be over, I still don’t have a comprehensive answer, but maybe telling the story will help someone somewhere in some way some day. Here’s what happened:

The good thing with MediaTemple is that you get almost realtime reports regarding your resource usage. That’s how I saw that something was not right: most of my billable resources were being consumed by pages on my blog that couldn’t possibly be valid URLs. And there had already been hundreds of thousands of requests targeting these URLs. Well, after downloading the logs for that day it became pretty obvious the originating server was 65.55.107.111, which resolves to

OrgName: Microsoft Corp
OrgID: MSFT
Address: One Microsoft Way
City: Redmond
StateProv: WA
PostalCode: 98052
Country: US

NetRange: 65.52.0.0 – 65.55.255.255
CIDR: 65.52.0.0/14
NetName: MICROSOFT-1BLK
NetHandle: NET-65-52-0-0-1
Parent: NET-65-0-0-0-0
NetType: Direct Assignment
NameServer: NS1.MSFT.NET
NameServer: NS5.MSFT.NET
NameServer: NS2.MSFT.NET
NameServer: NS3.MSFT.NET
NameServer: NS4.MSFT.NET
Comment:
RegDate: 2001-02-14
Updated: 2004-12-09

So far so good. Could be Microsoft, could be a spoofed attack pretending to come from MSN’s IP address. The reason why I thought an MSN server was the genuine source was the nature of the URLs used: they looked highly recursive, like someone made a horrible mistake programming their bot. And now it was stuck in an infinite loop querying my site!
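The log check described above can be sketched roughly as follows. This is only an illustration, not the script I actually used: it assumes the access log is in the usual Common/Combined Log Format, the sample lines (dates, paths, the second IP) are made up, and the function names are hypothetical. It tallies requests per client IP and checks whether an address falls inside the 65.52.0.0/14 block from the whois output. Note that this says nothing about spoofing; it only confirms who the address is registered to.

```python
import ipaddress
import re
from collections import Counter

# Network range taken from the whois output quoted in the post.
MICROSOFT_BLOCK = ipaddress.ip_network("65.52.0.0/14")

# Assumption: log lines start like Common/Combined Log Format:
# client-ip ident user [timestamp] "request line" status ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3}')

# Made-up sample lines, for illustration only.
SAMPLE = [
    '65.55.107.111 - - [01/Jan/2000:00:00:01 +0000] "GET /a/a/a/ HTTP/1.1" 200 0',
    '192.0.2.10 - - [01/Jan/2000:00:00:02 +0000] "GET / HTTP/1.1" 200 512',
    '65.55.107.111 - - [01/Jan/2000:00:00:03 +0000] "GET /a/a/a/a/ HTTP/1.1" 200 0',
    'this line is garbage and gets skipped',
]


def count_requests_by_ip(lines):
    """Return a Counter mapping client IP to number of parsable requests."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def in_block(ip, network=MICROSOFT_BLOCK):
    """True if the IP address lies inside the given network range."""
    return ipaddress.ip_address(ip) in network


if __name__ == "__main__":
    counts = count_requests_by_ip(SAMPLE)
    for ip, hits in counts.most_common(3):
        print(ip, hits, "in Microsoft block" if in_block(ip) else "elsewhere")
```

For a real log you would pass an open file object instead of SAMPLE, since the function only needs something it can iterate over line by line.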

First measure: block those URLs

The first and easiest thing to do was go into the WordPress code and hardwire the response to those URLs. Since they would never occur during normal web browsing anyway, this was an easy choice. I made it so WP would return no data upon such calls, so there was no further HTML that could be parsed for more recursive mayhem to be added to the bot’s to-do list. So far, so good. Because now execution was canceled as soon as the URL was called, excess CPU cycles had been cut by one-fifteenth. Not bad! But still, at the rate those requests were made, it was clear that by the end of the billing cycle I would still be well above my allotted limit. However, I felt this was all I could do on the technical side of things.
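To give an idea of what such a short-circuit might look like, here is a rough sketch in Python rather than the actual WordPress (PHP) change. Everything in it is an assumption made for illustration: the names looks_recursive and handle_request are hypothetical, and "recursive" is guessed to mean that the same path segment repeats several times, which may not match what the real URLs looked like.

```python
from collections import Counter
from urllib.parse import urlsplit


def looks_recursive(url, max_repeats=2):
    """Guess whether a URL is bot-generated nonsense.

    Assumption: a path is suspicious when any single segment
    occurs more than max_repeats times, e.g. /tag/tag/tag/.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if not segments:
        return False
    return max(Counter(segments).values()) > max_repeats


def handle_request(url, render_page):
    """Bail out before any expensive work for suspicious URLs.

    render_page stands in for whatever normally builds the page.
    """
    if looks_recursive(url):
        # Empty body: nothing to render, and no links for a crawler to follow.
        return b""
    return render_page(url)
```

The point of the design is simply that the check runs before anything expensive does, and that the empty response gives the bot no new links to queue up.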

Second: contacting Microsoft

All right, what do you do if there is a company out there, hammering your server? You write them a nice notice, informing them that they have a rogue bot, of course. Oh, how naive I was. I thought it was actually possible to contact someone, they would listen, surely discover their mistake and fix it! Ha, maybe they’d even apologize for causing me costs and workload, I thought. The hubris! There are maybe a handful of email addresses that you can use to contact MS in case of problems. However, half of them return error messages right back at you. The other half, I imagine, are just huge data graves where emails go to die. Of course, there was no help coming. It’s just not possible to reach someone who cares. I was at least hoping to get the infamous Condescending Automated Response, but apparently my problem wasn’t even worthy of that.

Third: what about MediaTemple?

Well, if MSN wasn’t going to do anything at all, maybe I could turn to my provider for help. Of course, the thing you have to keep in mind is that MT is profiting from such things happening to their customers. Nevertheless, I wrote a diligent message detailing the problem to MediaTemple’s support. In the beginning I was even hopeful, because some first-responder sent me a mail right back explaining that my request had been escalated to a sysadmin. However, this state of hopefulness quickly faded away when the sysadmin finally gave me my Condescending Automated Response. It explained things along the lines of “if you don’t want bots spidering your site, you can exclude them by editing the robots.txt”. Bloody brilliant, like I hadn’t already forbidden MSN to crawl my site. Like these million requests were part of a normal indexing run, sure!

Upon explaining these things again to MT support, I got a semi-useful message back: there just wasn’t anything they could do, period. Blocking this IP would mean other customers’ sites couldn’t be indexed by MSN. And I could always use an .htaccess rule to further cut down on CPU cycles. But otherwise, that’s just the risk of running a site.

And then, everything went quiet

I’m not really sure what happened next. The attack suddenly stopped. Maybe MediaTemple had suddenly recognized the fact that this wasn’t a normal bot running its index and blocked it, though I doubt it. Maybe MSN finally rebooted their server, though I’m fairly sure they didn’t even get the message that anything was wrong. Maybe it will even happen again come next indexing run. Who knows? It’s not like you get any courtesy information out of any of those companies. And if it happens again? Well, I’ll just have to pay up then, won’t I?

What little can be done

I’ve excluded MSN bots from spidering the site at several levels. It’s the least I could do. And it’s not like there is any meaningful traffic coming through MSN search, either. I would encourage other people to do this as well, because if an MSN bot goes rogue, there is absolutely nothing you can do against that as a lowly site owner. The least you can do to protect yourself is to pull your stuff from Microsoft-related indexes.
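If you want to double-check that your robots.txt really says what you think it says, Python’s standard urllib.robotparser can read it for you. The sketch below is a minimal example under a couple of assumptions: the robots.txt content is a made-up example, not my actual file, "msnbot" is assumed to be the user-agent token the crawler matches on, and is_allowed is a hypothetical helper name. Keep in mind that robots.txt is purely advisory; as the story above shows, a misbehaving bot can simply ignore it.

```python
from urllib.robotparser import RobotFileParser

# Example rules only (assumed, not the actual file from this site):
# shut out the MSN crawler entirely, say nothing about anyone else.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if the given rules permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("msnbot", "Googlebot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, "/some/page") else "blocked"
        print(agent, verdict)
```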

Udo’s Techblog

All things considered, the attack did turn out to be not so bad, but I certainly didn’t enjoy the hacking and the posting of spam in my name. These recent events have added to the negativity that currently, in sum, makes up my life. Things have been going downhill for a long time now, and I just don’t know where the bottom is yet. I guess this is also the reason for the shocking lack of original content recently. I haven’t decided what to do with the blog yet. If nothing else, it certainly has allowed unpleasant people in my life another angle of attack. The blog comes up as the first result when someone googles my name, and Analytics is telling me lots of people have been doing exactly that recently. That’s nice as long as everything is going great. But if you’re bankrupt and overall not doing so well, it becomes another thing entirely.

