
Searching for Malware: Essence and Methodology of the Research

David Maynor and Paul Judge with Barracuda Labs give a Defcon presentation reflecting their research on malware distributed via online search resources.

Dr. Paul Q. Judge (Chief Research Officer and VP at Barracuda Networks): Good afternoon, thanks for joining us for this session. I am Paul Judge, and this is David Maynor. What we want to spend some time on today is search malware, and we’ll also share with you some of the results that we’ve seen. For the last several months, we have been looking into this issue. Over the last year, there have been many examples of malware poisoning popular search terms, and we’ve all seen some of them.

Our goal was to understand how much this is happening, where it’s happening, and a little bit more about how it’s happening.

And so, as we dug into this, one thing that became pretty obvious was why the attackers are focused on the search engines.

Search volumes

The number of eyeballs showing up on the search engines every day is growing rapidly. If you look at the latest numbers from Microsoft, Yahoo, even Twitter now, and Google, there are hundreds of millions of searches done on each one of these every day. Microsoft comes in at over 4 billion searches a month, Yahoo at over 9 billion, Twitter is now claiming over 24 billion, and Google leads with over 80 billion searches a month.

So the point is, as more information and more users come online, we all use search engines more and more. I have personally come to the point where I am so lazy that even when I know I’m going to a site like CNN, instead of just typing cnn.com, I don’t have time to type the actual 4 characters, so I just put it into my search toolbar and let it do the work. I see a lot of heads nodding in the audience, so a lot of people have developed that habit.

So the point is that there are so many people going to search engines every day, and the attackers realize that this is a pretty good place to focus to get a lot of eyeballs.

And so, what we wanted to do was understand how they are targeting particular terms, how much they are targeting particular terms, whether there are particular categories that are more popular, and so forth.

Search engines crawling methodology

So with that, we set up a system, a methodology, to crawl the different search engines. Around the clock, it pulled the most popular search terms for Yahoo, Bing, Twitter and Google, and looked at what the search results were for those terms. We pulled those search results, then pulled the pages they were pointing to, and analyzed them.
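
The crawl loop described here can be sketched roughly as follows. This is a minimal illustration, not the system Barracuda Labs actually ran: the talk does not say how terms were collected, how results were fetched, or how pages were analyzed, so those steps are passed in as hypothetical callables (popular_terms, search, fetch, is_malicious), and the Finding record is an assumed shape.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Finding:
    """One malicious search result (hypothetical record shape)."""
    engine: str
    term: str
    url: str


def crawl_once(
    engines: Iterable[str],
    popular_terms: Callable[[str], List[str]],   # assumed: engine -> trending terms
    search: Callable[[str, str], List[str]],     # assumed: (engine, term) -> result URLs
    fetch: Callable[[str], bytes],               # assumed: URL -> page body
    is_malicious: Callable[[bytes], bool],       # assumed: page body -> verdict
) -> List[Finding]:
    """One pass: popular terms -> search results -> landing pages -> analysis."""
    findings: List[Finding] = []
    for engine in engines:
        for term in popular_terms(engine):
            for url in search(engine, term):
                try:
                    page = fetch(url)
                except OSError:
                    continue  # page unreachable on this pass; skip it
                if is_malicious(page):
                    findings.append(Finding(engine, term, url))
    return findings
```

Running crawl_once on a schedule and timestamping each Finding would give the kind of per-engine, per-day and per-hour counts discussed in the rest of the talk.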

Frequency of search engine malware

So what we looked at was 4 different search engines over about 2 months, 57 days to be precise. In that timeframe we examined 25,752 popular topics and over 5 million actual search results.

So we’re gonna dig into what we found. One of the first points is: we found malware. Anybody surprised by that? Over 8,000 examples of malware across the different engines (see histogram).

Total malware by search engine

And if you look at the breakdown, the leader is Google, with 69% of the malware that we found coming from Google search results. After that was Yahoo, with 18%, and Bing, with 12% (see diagram). This is one of the first times in my life Microsoft actually has the advantage of having a small market share, and so being the least attacked platform. The other one you see there is Twitter, with 1%. At first that was a little strange to us, because we are pretty familiar with how much malicious activity and misuse is happening on Twitter. But here is what we understood as we dug into this. Think about how a search engine works: it organizes and ranks results, and that makes it pretty easy for an attacker to use search engine optimization to make sure they are in the set of results that a user gets.

Whereas with Twitter, most of the time, the way search works is that it just gives you a snapshot of who’s talking about this right now. There is no ranking and no prioritization. An attacker trying to poison a search engine is gonna make sure their attacks get to the top. On Twitter, they are more playing the odds and seeing where they end up in that random string. And that’s what explains the 1% number. But as we get further along, we’re gonna show some examples of specifically the type of things that are happening inside of Twitter.

While analyzing the daily activity for each of the engines, we saw that Google led the pack every day; the other engines had a little more or a little less on different days. But what becomes interesting is that pretty much every day, each engine led to something malicious. No engine really took a day off.

Malware captured by day of week

If we look at the different days of the week, one of the first things we wanted to know is whether particular days have more activity (see histogram). The short answer is ‘No’, not really. Tuesday led a little bit, representing about 16.7% of the overall malware for the week. But there wasn’t a strong correlation with the day of the week.
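
For context, a perfectly even spread over seven days would put about 14.3% on each day, so 16.7% is only a modest lead. A minimal sketch of how such a per-weekday share could be computed from detection dates is shown below; the function name and the input format (one date per detection) are assumptions, not details from the talk.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Share each day would get if detections were spread evenly (about 14.3%).
UNIFORM_SHARE = 100.0 / 7


def share_by_weekday(detection_dates: Iterable[date]) -> Dict[str, float]:
    """Percentage of detections falling on each weekday (all 7 keys present)."""
    counts = Counter(d.weekday() for d in detection_dates)  # Monday == 0
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in WEEKDAYS}
    return {name: 100.0 * counts[i] / total for i, name in enumerate(WEEKDAYS)}
```

Comparing each value against UNIFORM_SHARE gives a quick sense of whether any day stands out.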

Malware captured by time of day

But this got a little more interesting when we looked at the time of day (see histogram). These time periods are based on Eastern Standard Time. If you look at the 11 PM to 5 AM interval, over 50% of the malware was found in that slot. So if you think about this engine running around the clock, pulling in the popular search terms, pulling in the results, and then analyzing those, we had over 50% of the activity in that 6-hour block at the end of the night.
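
A 6-hour block is a quarter of the day, so over 50% is at least twice what an even spread would give. The sketch below shows one way to measure that share, assuming each detection carries a timezone-aware timestamp; the function names are illustrative, and EST is modeled as a fixed UTC-5 offset with no daylight saving adjustment.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable

# Eastern Standard Time as a fixed offset (assumption: no daylight saving).
EST = timezone(timedelta(hours=-5), "EST")


def in_overnight_window(ts: datetime, start_hour: int = 23, end_hour: int = 5) -> bool:
    """True if a timezone-aware timestamp falls in 11 PM to 5 AM EST."""
    hour = ts.astimezone(EST).hour
    # The window wraps past midnight, so it is a union of two ranges.
    return hour >= start_hour or hour < end_hour


def overnight_share(timestamps: Iterable[datetime]) -> float:
    """Percentage of detections that fall inside the overnight window."""
    stamps = list(timestamps)
    if not stamps:
        return 0.0
    hits = sum(in_overnight_window(t) for t in stamps)
    return 100.0 * hits / len(stamps)
```

Timestamps must be timezone-aware (for example, recorded in UTC); naive datetimes would be interpreted in the local zone of the machine running the code.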

Read next: Searching for Malware 2: Prevalent Patterns of Malware Distribution
