2008-11-17: CBL-observed Effects of the McColo Outage

2008-11-20 update at end.

As some of you are aware, on November 11 at approximately 21:30 hours GMT, the Internet hosting company McColo Corporation was disconnected (by its Internet access providers) from the Internet.

For greater detail on what happened and why, the following two articles tell the story: Host of Internet Spam Groups Is Cut Off and A Closer Look at McColo.

Many people working in anti-spam/malware/phish, including the CBL team, were well aware of the issues being tracked back to McColo. Many, including the CBL, knew the magnitude of these issues, and could make reasonably sound theoretical predictions of what would happen if McColo was disconnected.

The numbers seemed ridiculous: it simply didn't seem possible that so much was dependent on just one hosting company.

But, as ridiculous as our predictions seemed to be, they were probably not high enough: Spam Volumes Drop by Two-Thirds After Firm Goes Offline.

For historical purposes, we include the effects as seen by the CBL here.

Here's another perspective: Post McColo Spam.

What do we see?

The CBL uses many heuristics to detect infected machines. There are two types: generalized "behavioural" methods that detect infected/compromised machines, and methods by which we can precisely identify which malware (usually a BOTNET) is responsible. The CBL was originally based on the former, but over the past year or two, methods that allow us to pinpoint the malware responsible have become the more effective part of our arsenal.

We call the latter "named BOT detections".

On the eve of the McColo disconnection, "named BOT" detections represented about half of the total IP addresses listed by the CBL. At that time, we measured that the named BOTs were responsible for about 68% of all of the spam the CBL detects.

The "named BOTs" are the BOTNETs that most researchers talk about, such as Srizbi, Cutwail/Pushdo, Ozdok/Mega-D, Bobax/Kraken, Rustock, Asprox, Storm, Warezov and others. Srizbi was by far the largest, responsible for around 35% of all spam caught in our spam traps. Cutwail was second (at around 18%), with most of the others in the 5-10% range. For more information on Srizbi, see 60 Billions Spams a Day.

Note on Storm: despite the media buzz and public perceptions of the present threat of the "Storm virus", the Storm virus simply hasn't been a factor in spam or infections for at least 4 months. Its notoriety is mainly due to its unique technological features (peer-to-peer control, built-in capability to DDOS researchers etc), but even at its peak, it couldn't hold a candle to Cutwail and Srizbi volumes. It also has unique architectural weaknesses that make it vulnerable to countermeasures that can cause large segments of the Storm BOTNET to shut down the Storm BOT code (not the infected computer), after which it cannot be resurrected.

These BOTNETs consist of tens or hundreds of thousands of infected machines across the world, all obtaining their instructions from centralized "command and control" - known in the business as "C&C". A Closer Look at McColo shows many of these "C&C" facilities had been traced back to being hosted at McColo. In fact, most of the BOTNETs we mentioned by name have C&C facilities at McColo.

We can think of most of the commands that the C&C give to the infected machines as being "work orders" - a copy of the spam to send (the BOTNET machines randomize parts of the content to evade filters), and a batch of email addresses to send it to. But of course, many of these BOTNETs can be instructed to perform distributed denial of service (DDOS) attacks - eg: Storm's well-documented ability to instruct the entire BOTNET to unleash a ping attack against anyone looking too closely at the Storm malware downloading sites.

If the "C&C" were disconnected from the Internet, so the theory goes, the associated BOTNETs would not get new work orders, and hence stop what they're doing (spamming and other things).

The following major BOTNETs showed immediate effects when McColo was disconnected, in the form of a sudden precipitous drop in CBL detections: Srizbi, Rustock, Asprox, Bobax, and Ozdok/Mega-D.

Ozdok/Mega-D went virtually silent within an hour. Bobax had a big chunk (about half) taken out of it within a few hours. Srizbi, Rustock and Asprox dropped by more than 95% from normal levels within hours. Eg: Srizbi dropped from 170,000-190,000 detections per day to about 3500. Cutwail/Pushdo lost about 15% over the first 24 hours of the McColo outage. Other, much lesser known BOTNETs were also impacted.

Why didn't they drop to zero immediately? Well, first off, the existing BOTs still have work orders to complete. It looks like Ozdok work orders were considerably smaller than most of the others, so Ozdok stopped more quickly. Bobax wasn't impacted nearly as severely because Bobax has more than one C&C cluster, and not all of them are hosted at McColo.

Why is Cutwail still going? The reason is uncertain, but it could be a combination of multiple C&C, better failover to secondary C&C if the primary goes down, or "open ended" work orders - work orders that say "keep sending this spam to the following users until instructed otherwise" - and they never got instructed otherwise.
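The reasoning in the last two paragraphs can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Bot` class, its fields, the `still_sending` helper and the send rate are all hypothetical and do not describe how any real BOTNET is implemented. It shows only why small work orders run out first, and why open-ended orders or a surviving secondary C&C keep a BOT sending.

```python
from dataclasses import dataclass


@dataclass
class Bot:
    """Hypothetical model of one infected machine (illustration only)."""
    remaining: int               # addresses left in the current work order
    open_ended: bool = False     # order says "keep sending until told otherwise"
    has_backup_cc: bool = False  # can still fetch orders from another C&C


def still_sending(bots, hours_offline, sends_per_hour):
    """Count BOTs still sending after the primary C&C has been offline."""
    active = 0
    for bot in bots:
        if bot.open_ended or bot.has_backup_cc:
            # Never runs dry: the order has no end, or new orders keep arriving.
            active += 1
        elif bot.remaining > hours_offline * sends_per_hour:
            # Finite order that has not been used up yet.
            active += 1
    return active


# Assumed example fleet: the numbers are made up for illustration.
example_fleet = [
    Bot(remaining=500),                      # small order, finishes quickly
    Bot(remaining=5000),                     # larger order, lasts a few hours
    Bot(remaining=100, open_ended=True),     # open-ended work order
    Bot(remaining=100, has_backup_cc=True),  # secondary C&C still reachable
]

for hours in (0, 1, 6):
    print(hours, still_sending(example_fleet, hours, sends_per_hour=1000))
```

In this toy fleet, only the two BOTs with finite orders and no backup C&C ever fall silent, and the one with the smaller order goes first.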

Srizbi in particular had a rather stupid failover mechanism, documented and exploited by FireEye, as described in "100,000 Srizbi IPs detected in 24 hours".

Notice how the list of affected BOTNETs agrees with A Closer Look at McColo?

Overall CBL IP address detections, whether they be known and named infectors (like Srizbi) or from our generalized behavioural detectors, went down by about 50% from normal levels (from nearly 1 million IP addresses of infected machines per day to 470,000 per day). Almost half of our behavioural detectors dropped by more than 50%; only a small number stayed at previous levels.

In the spam trap, "named BOTNETs" dropped from the previous 68% of total "CBL caught spam" to less than 33%.

In terms of total spam volumes, just one of our spam trap servers dropped from 30 spams per second to 14 in minutes (slightly more than a 50% drop), and has stayed at that level for at least two days. Other traps the CBL uses have seen drops of 65-80%. There are many reports from around the industry talking about drops in spam volume of 60-80% or more.
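As a quick check on the percentages quoted so far, the drops can be recomputed from the raw figures in the text. This is a minimal sketch: the helper name `percent_drop` is ours, and "nearly 1 million" is taken as exactly 1,000,000 for the sake of the arithmetic.

```python
def percent_drop(before, after):
    """Return the drop from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Figures quoted in the text above.
spam_trap = percent_drop(30, 14)               # spams per second on one trap
detections = percent_drop(1_000_000, 470_000)  # CBL detections per day
srizbi_low = percent_drop(170_000, 3_500)      # Srizbi, low end of normal range
srizbi_high = percent_drop(190_000, 3_500)     # Srizbi, high end of normal range

for name, value in [
    ("spam trap", spam_trap),
    ("CBL detections", detections),
    ("Srizbi (from 170,000)", srizbi_low),
    ("Srizbi (from 190,000)", srizbi_high),
]:
    print(f"{name}: {value:.1f}% drop")
```

The spam trap and overall detection figures both come out slightly above 50%, and the Srizbi figure is comfortably above the 95% mentioned earlier.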

Note: a few ISPs aren't seeing substantial drops. In the claims that we've been able to investigate, it turns out that the ISP wasn't measuring spam volumes directly. Instead, they were measuring secondary effects - such as user complaint volumes. But most ISPs are using filters that do well against BOTNET spam (eg: CBL/XBL, PBL etc), and their users were getting very little of it before, so of course the complaint volume wouldn't change by much after. When their true inbound spam volumes were measured before their filters got a chance to see it, they did show a big drop.

In fact, if you noticed a steep decline in spam in your inbox as a result of the McColo disconnection, this is an indication that you need better spam filters.

Some individual machines in these BOTNETs continue to operate, send email and get detected, but in most cases this will be because they still have "work orders" yet to complete, or are broken in some sense and are "stuck in a loop" replaying their last instructions.

At the time of writing, 6 days later, Srizbi, Asprox and Rustock continue to operate at less than 2% of their former magnitude. Ozdok/Mega-D has been completely down (as in _zero_ emission) for more than 2 days except for an hour or two on Sunday Nov 16 (see below). Those four alone were responsible for more than 50% of all pre-McColo-outage spam.

The CBL also saw that Ozdok/Mega-D went down hard when Intercage (aka Atrivo) was disconnected at the end of September 2008 for much the same reasons that McColo was disconnected. It took less than 4 days for Ozdok/Mega-D to find new hosting - the CBL's experience confirms where they got it from. The Ozdok/Mega-D BOTNET was responsible for approximately 10% of all spam at the time of the Intercage outage. A drop of 10% is hard to prove amongst the normal variations of spam load, but the 50% or more as a result of the McColo disconnection is unmistakeable.

BOTNET recovery

Everyone in the industry is predicting that these BOTNETs will be back. The Intercage outage demonstrates that the BOTNETs will respond to such outages. If they can.

What signs have we seen?

Clearly Bobax was not entirely killed by the McColo outage. It is now working overtime, sending more spam than it ever did. It is now the most prolific spam-sending BOTNET (at around 16% of all named BOTNET spam). Our detection rate of Cutwail has recovered to previous levels. However, it still seems to be having some difficulties, and its spam volume has declined to about 9% of all spam.

Warezov had not been seen for many months, but prior to the McColo outage we were seeing hints of a reappearance. Since McColo, Warezov has struggled mightily, and has reached about 6.8% of all spam.

On Sunday, Nov 16 (yesterday) at approximately 00:00 GMT, the backbone provider Telia (or more likely, one of their reseller/customers) was persuaded to reconnect McColo to the Internet. Srizbi, Rustock, and Ozdok made a brief reappearance, during which all three of these BOTs generated about 10% of the CBL detections that they would normally generate over a 24 hour interval. This lasted for less than an hour, and they've resumed previous levels of virtually no detections at all.

When informed of the situation, Telia responded quickly and disconnected McColo again.

It's obvious that the entity-that-is-McColo will continue to try to get new connectivity. But even if they don't, the BOTNETs will probably be back.

2008-11-20: The (partial) Resurrection

As of approximately 10:00 hours GMT this morning, the Asprox BOTNET began resurrecting itself. The CBL is detecting them at a rate very comparable to pre-McColo outage levels. So, one must assume that Asprox is now back in full operation for the time being.

Sometime today, Ozdok/Mega-D also began reappearing. We don't have a good handle on where it's going yet, but going by appearances, it will be above the level we saw before the McColo outage.

The CBL is also seeing a surge in "general behaviour" detection rates. The McColo outage seemed to provoke an overall CBL detection drop of 50% as we described above. Over the past 12 hours, the CBL detection rate has regained half of that drop, and has reached 750,000 unique IP address detections over the past 24 hours.

It's too soon to tell what effect this will have on measurable spam volumes. One of our data partners has reported spam volume at pre-McColo outage levels. Others have reported a surge in volumes, but not (yet) close to pre-McColo outage levels. We haven't yet seen public reports showing a compelling case for an increase. In a day or two, we'll know for sure.

The resurrected Asprox seems weak in the "spam sending" department, but Ozdok looks like it's really angry for some reason...

We'll see about that.

Still no sign of Srizbi or Rustock rebirth.

2008-11-25: The (still partial) Resurrection

Asprox, Rustock and Mega-D have come back with a vengeance in terms of detections and spam volumes.

Cutwail has spiked higher in detections than we ever recall it being, but its volumes still remain a shadow of what they were.

The overall CBL detection rate has climbed from the low (450K/day) to about 750K/day, which is about halfway back to the high immediately prior to the McColo disconnection (1M/day).
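The "about halfway" figure can be checked by expressing the current rate as a fraction of the gap between the low and the high. This is a small sketch using the rounded numbers quoted above; the function name `recovery_fraction` is ours.

```python
def recovery_fraction(high, low, current):
    """Fraction of the drop from `high` to `low` that `current` has regained."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (current - low) / (high - low)


# Rounded detections per day, as quoted in the text.
recovered = recovery_fraction(high=1_000_000, low=450_000, current=750_000)
print(f"{recovered:.0%} of the drop has been regained")
```

With these rounded inputs the result is a little over one half, which is consistent with "about halfway".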

Volumes are up, as you can see here: Senderbase (via Talos), SpamCop and DCC, but they are still quite a bit lower than the volumes immediately prior to the McColo disconnection - approximately 40% lower.

2008-12-04: The Plateau

Overall volumes and detection rates seem to be about the same as they were in our previous entry (Nov 25). However, the volume of spam assignable to "named" BOTNETs has finally risen to 74% of total CBL detected volume, above both the pre-McColo maximum of 68% and the post-McColo minimum of 34%.

Cutwail is still strong - pre-McColo and current detection rates are about the same, but total spam volume is about half.

Mega-D is stronger than before and now is in first place in the spam volume sweepstakes (~20%).

Asprox is fluctuating madly, and is obviously having severe difficulties. C&C hosting problems? Too bad.

Bobax and Warezov have shrunk somewhat from their post-McColo peak.

Srizbi is still not a significant factor - in fact, it seems to have fallen on its face again after a few brief pathetic spikes.

There are a couple of new BOTNETs apparent. It's not altogether clear whether they're just reincarnations of previous BOTNETs or something entirely new. Probably not new. The BOTNET spewing live.com gambling and pharmaceutical links has practically died out - looks like Microsoft is finally getting on the ball in killing the bogus web sites.