Drive-by downloads: exploiting cross-site scripting vulnerabilities


Engineering manager at Twitter (co-founder of Dasient) Neil Daswani and CTO at Cenzic Lars Ewe discuss today’s Internet safety challenges at the RSA 2011 Conference in their talk ‘Drive-by downloads: How To Avoid Getting a Cap Popped in Your App’. Their focus areas in this part of the talk are drive-by downloads exploiting cross-site scripting vulnerabilities on websites, as well as vectors of cyber attacks in the historical context.

Neil Daswani (web application security expert, engineering manager at Twitter): Hi, my name is Neil Daswani and I am here today with Lars Ewe. We will be talking about drive-by downloads and how to avoid getting a cap popped in your app.

There has been quite a bit of buzz around this talk in a number of publications, and part of that is because one of the things that we’re gonna be doing in this talk is showing how an actual website can get infected by taking advantage of a web application vulnerability, and infecting users when somebody simply loads and visits the web page.

Now, before I kind of get into a detailed demo of the particular vulnerability and the drive-by download that occurs, one of the things that I want to mention is that this particular vulnerability was identified by Gerry Eisenhaur – one of the key researchers at Dasient. He discovered this drive-by on a number of real websites.

What he had found is that the vulnerability was a cross-site scripting vulnerability, a persistent cross-site scripting vulnerability that had existed on multiple websites due to the fact that they were using a third party web application to run part of the site.

Now, we had seen that there were a couple of things going on, so the websites were infected, the third party web application had a vulnerability, it was affecting multiple types of websites, across multiple verticals, including security vendors. And we worked together with multiple parties to lock down the vulnerability but it still exists on some sites.

So, being responsible software companies and following Responsible Vulnerability Disclosure protocols, we are not gonna be showing the issue on the exact sites on which it was found. Instead, we’re gonna show a slightly different vulnerability on a mock site that we have made up so that you can see exactly what happens.

But basically it just shows the dangers of web application vulnerabilities. You can have a persistent cross-site scripting vulnerability, and you can use it to inject a drive-by download into a site.
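To make the idea of a persistent cross-site scripting vulnerability concrete, here is a minimal sketch in Python. It is not the vulnerability from the talk: the comment store, the function names and the `attacker.example` URL are all hypothetical, and the payload is an inert placeholder. It only shows that stored user input rendered without escaping becomes live markup for every later visitor, and that escaping on output neutralizes it.

```python
import html

# Hypothetical in-memory store standing in for a site's database of user comments.
stored_comments = []

def save_comment(text):
    """Store user input exactly as submitted (this is where the payload persists)."""
    stored_comments.append(text)

def render_unsafe():
    """Vulnerable rendering: stored text is pasted into the page as-is."""
    return "".join(f"<li>{c}</li>" for c in stored_comments)

def render_escaped():
    """Safer rendering: HTML metacharacters are escaped on output."""
    return "".join(f"<li>{html.escape(c)}</li>" for c in stored_comments)

# A placeholder payload of the kind an attacker might submit once.
save_comment('<iframe src="http://attacker.example/exploit"></iframe>')

print(render_unsafe())   # the iframe tag reaches every visitor's browser intact
print(render_escaped())  # the same text is shown as harmless characters
```

Escaping on output is only one of several defenses; the point of the sketch is that the injected content is served to everyone who loads the page, not just to the attacker.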

The way that this particular demo is going to work is a ‘benign’ drive-by has been injected into a site via such a vulnerability. And when one visits the web page, we’ve written this so it basically pops up the Windows calculator as an example of what could happen. We could have popped up any application, sent any malware binary to the users just when they load the page, but we are just popping up the Windows calculator. And then, there is a bunch of details on how the attack occurs, but let me go ahead and switch to the video and show exactly how it occurs.

Sample drive-by download occurrence video
So we have a website at a domain that we put up, called ‘willinglydumb’. And when we enter the URL into the web browser, we will see that the page will load, and it will load its login page, but there is a malicious iFrame on this page and it basically invokes a vulnerability in the Acrobat Reader process. So you will see on the left side the Internet Explorer window launch Acrobat Reader, and that launched this drive-by download ‘e.exe’, which is the Windows calculator, and that happened very quickly (see video).

So just visiting a web page these days can be very dangerous. And it’s kind of interesting as to how bad things have gotten on the Web. These attacks very often impact legitimate websites. And so on this slide, while we say ‘drive-by via XSS on XXX website’, we are not talking about an X-rated website, we’ve just x’ed out the name.

But one question you might have is: “How did we get here? How did we get to a state of the Web where someone can just load a webpage and have an arbitrary application started on their machine without their permission?” Lars is going to provide us with a bit of historical context and tell us about how we got here.

Lars Ewe (technology executive in web application development, CTO and VP of Engineering at Cenzic): Let’s do a quick overview of sort of the history behind all of this because, as Neil said, it is actually quite concerning when you see where we are and what the state of affairs is. These days you go to any website, and if you are not savvy enough you might actually get affected without ever knowing that it hit you.

So when you look at the history though – and obviously this is gonna be a very high-level overview here of how we got to where we are – there have been phases here, and we’ve gone through these phases that have changed over time.

And you will see that in the early days, let’s say in the 80’s roughly, we were talking about basic viruses. When they first came out they often would attack at the BIOS level; they would sit on your drive, in the boot sector, and often be dormant. Like, you know, Michelangelo – one of the famous ones that people can refer to. Those of you who are old enough to still remember that should know that basically what happened there was it would sit there dormant and then go live at a certain time, which happened to be the birthday of Michelangelo, so that’s where the name came from. And it would wipe out the disk, wipe out sectors of the disk – things of that nature. But when you think about it, the distribution was very limited, it usually happened through floppy disks – a sort of ‘sneaker network’ approach, if you will.

Evolution of security
Then over time, as networks became more prevalent and the Internet took over the way we do business, you had different attack levels. Attacks happened at the network layer, and attackers could distribute them more easily. So we started seeing IDS’s and then IPS’s to try to protect against that, and Firewalls these days are very common practice; in fact it’s pretty hard to find many institutions that don’t run Firewalls or IDS’s and IPS’s of some sort.

But then after that, in the 90’s and early 2000’s we saw a new attack vector emerge, and that one was really at the application layer. So as we kind of moved from the infrastructure layer and BIOS layer up to the network layer, we worked ourselves all the way up to the application layer.

Fundamental change in malware distribution
And the application layer is different. One of the challenges that lies within that is it allows for different distribution. So if you look at how malware has historically been distributed over time, you will notice that, as I said, it started early on with floppy disks and that’s how people got infected. I remember, and you might remember we were cautious of a floppy disk, if you had a concept of being cautious in the first place.

Then over time, we moved more and more to network layer attacks, often through email. People got to the point where they realized they don’t want to just open any email; they are cautious of the attachments in emails.

As of late, you will notice that all of a sudden all it takes is visiting a website, and you might never know what hit you until it hits you. So that’s a different layer of sophistication that we are seeing. And also, the way things get exploited has changed. It used to be more of an attack that happened at the client side and often at the OS level; nowadays the root cause is at the server that you are visiting, so the website itself gets affected, and then consequently that spreads down to you. So it’s a different distribution mechanism, different attack level, different sophistication.

Last but not least, I think what Neil stressed before, it’s really important to take away here that now we are talking about legitimate websites. So before, it used to be shady floppy disks. If someone had free software for you, well, you might wanna go double check on that. We did have cases where professional floppy disk companies got affected, but not so often. These days, most of these attacks happen on legitimate websites. So although you might not think at all that going to a certain website might affect you, be cautious of that. And it starts affecting reputations of companies, so it is also a totally different ball game these days.

Ways to infect a site
So with that being said, I am gonna hand that over to Neil again, and Neil is going to walk us a little bit more through the anatomy of malware distribution. The one thing I will quickly cover here is software vulnerabilities (see upper right-hand corner of the image), which often are the root cause of where it starts. Most of you might have heard of cross-site scripting, SQL injection, cross-site request forgery – these are all pretty well known, or at least for most people the better-known web application vulnerabilities. Especially the ones that have to do with injection are very much the root cause of what we are talking about here, because they actually open the door for the attacker to then plant the malware on the site that you then later go and drive by, and that’s where the bad things happen. Well, that’s not the only vector, that’s one of the more prevalent vectors. There are widgets and other things that can affect you as well.

Neil Daswani: As Lars mentioned, software vulnerabilities, web application vulnerabilities are a key root cause of malware getting planted on websites in terms of drive-by downloads. There are several other root causes as well. So for instance, over the past four to five years, what has happened as the Web 2.0 transition has occurred is that if you go to a given website, the content on that website is coming from a lot of places, not just the website itself.

And there are a lot of widgets that are used on the websites. Some widgets are used to render advertising on a website, other widgets are used to provide audience measurement functionality, other widgets provide video playing functionality. In any case, there are tons of these different widgets, and when you visit a website there is content coming from all kinds of places. Now, of course anytime that you have a widget on a website that is rendered by a piece of JavaScript or an iFrame, the content provider or the enterprise business is pretty much giving up part of the control of the website to a third party. If that third party gets compromised, well, so can your website, and your website can be turned into a distribution vehicle for malware.
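One way to see how much of a page is under third-party control is to list the external hosts that its scripts and iFrames load from. The following is a rough sketch using only Python’s standard library; the class name, function name and the `.example` hosts are made up for illustration, and it inspects static HTML only, so anything a widget adds at runtime is not seen.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ThirdPartyFinder(HTMLParser):
    """Collects hosts of <script> and <iframe> sources that differ from the site's own host."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.third_party_hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "iframe"):
            return
        src = dict(attrs).get("src")
        if not src:
            return  # inline script or frame without a source
        host = urlparse(src).hostname  # None for relative (same-site) URLs
        if host and host != self.own_host:
            self.third_party_hosts.add(host)

def third_party_hosts(page_html, own_host):
    finder = ThirdPartyFinder(own_host)
    finder.feed(page_html)
    return finder.third_party_hosts

# Hypothetical page with one first-party script and two third-party widgets.
sample_page = """
<script src="/js/site.js"></script>
<script src="https://ads.example/widget.js"></script>
<iframe src="https://video.example/player"></iframe>
"""
print(sorted(third_party_hosts(sample_page, "www.mysite.example")))
```

Every host in that list is a party whose compromise could turn the page into a distribution vehicle, which is the trade-off described above.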

In fact, there have been specific names that have been attributed to such kinds of cases. For instance, in the case that you have a website and you have an ad widget on your website, which, you know, you may be relying on for revenue and monetizing your website; in the case that your website gets infected via malicious advertisements that come through that ad network, that’s often referred to as malvertising, and malvertising is also being increasingly used to spread malware on websites.

So in any case, attackers can use these techniques to infect a site, or two sites, or three sites, or thousands of sites because of the fact that a particular web application vulnerability might exist, say, in some third party package that is used by lots of websites. Similarly, a particular widget or an ad network might be used by many, many websites, and so when an attacker exploits any of these types of issues it typically allows them to distribute their malware through a network of websites that take advantage of this functionality.

In any case, that’s a little bit about the different vectors. So we will go through the anatomy of the steps that attackers conduct and the steps that take place when a drive-by download occurs.

There are typically 5 steps cyber criminals execute to effectively conduct drive-by downloads.

So, step one in using websites to distribute malware is you have to infect a website. In the demo that we showed earlier, it was a stored cross-site scripting vulnerability that was used to plant some drive-by download code onto a site and infect users. But there are many other ways to do it as well, as I just reviewed with regards to taking advantage of different widgets and whatnot.

The second step is, once the site has been infected, there’s a number of activities that take place online. So when a user loads a webpage that is infected, there is a whole bunch of resources on that web page. The legitimate resources on the web page get rendered, but so do the malicious JavaScripts and iFrames and whatnot that were injected.

Once those JavaScripts start running, one of the first things that the attackers’ code does is it basically fingerprints the user’s browser, figures out what version of the browser they are using, whether it’s IE or Firefox, or Chrome, or Safari. It figures out which third party plug-ins are being used, whether they have Acrobat Reader, Flash, ActiveX installed, etc.

And depending upon that, the attackers have an online exploit database which gets consulted. And what happens is that the client-side vulnerability that is most likely to result in a successful infection is chosen, and the corresponding shellcode to, say, take advantage of a buffer overflow or other type of vulnerability on the client is selected and the shellcode is delivered.

Once the shellcode is delivered, the attacker basically has control of the stack on the user’s machine and they take advantage of that capability to download a downloader – a piece of malware whose sole purpose in life is to download more malware. It basically also provides a level of interaction, so that once the attackers compromise a machine they can then decide to download different malware every single day if they like. They could one day download something that conducts fraud; another thing that it can do – and that’s the way it usually happens – is add that machine to a botnet of some sort so that they have further control over it.

Drive-by download steps:

1 – Infecting a site

2 – Invoking a client-side vulnerability

3 – Delivering a shellcode

4 – Sending a downloader

5 – Taking advantage of the infected PC

So once the downloader gets sent, once another piece of malware gets sent, it’s kind of ‘game over’ – the attacker has control over the machine. There are many ways this gets used: the cyber criminals that do this are interested in taking control of users’ machines, but they are also interested in doing things like planting malware on a corporate website, because they know employees access that corporate website very often, and it provides them with a great mechanism for compromising machines within the enterprise as well.

So there is a whole bunch of different mechanisms but these five steps: infecting the site, invoking the client-side vulnerability, sending the shellcode, sending the downloader and then doing whatever they would like with the machine – are typically the five steps that cyber criminals execute when they want to effectively conduct drive-by downloads.

So that’s a little bit about that process. What I’ll do is I’ll turn it back over to Lars to start talking about some of the things that organizations can do to help protect themselves using a defense-in-depth approach.

Lars Ewe: Now that we have talked about what can happen and hopefully convincingly enough proven the point that bad things can happen, very bad things indeed can happen, you wanna ask yourself – what do you do, how do you defend yourself? You know, it’s no fun just to learn how bad the state of defense is, it’s important to learn what your options are.

So the first thing is that you wanna assess your sites. You need to understand that if you are running a website, not only are you responsible for the data within your site – let’s say, you store credit card information or personal information, anything like that, obviously you carry responsibility, and many different complaints and issues will occur if you do not make sure you secure that data correctly, depending on the vector or vertical that you are in.

Lifecycle of malware protection
But just as important as that is the fact that you have responsibility to your user base. You do not want to be the one who distributes malware to all your users. Even if no data gets compromised on your site, data on the users’ machines might still be wiped out, or those machines might be joined to a botnet or something like that.

All that being said, that responsibility is yours as the website owner, so do assessments regularly. You have options there as to whose services you want to use, I will encourage you to do some regular antimalware, antivirus scanning – we refer to that often as persistent security.

Every time you do a patch to your site, you might potentially create a new security hole. Often the smallest little change in code can result in such a thing. New attack vectors come out all the time. Companies like Cenzic update their attack libraries on a very regular basis, in our case on a weekly basis.

So scanning very regularly is one thing. Then, try to prevent once you have findings. Once you know that there are vulnerabilities, try to address them – ideally in code, you have various means there, once the vulnerability is found you then know what specifically you have to do. Either way, at the code level you can also do code reviews, there are best practices you can follow at the code level. You will find that in many cases fixing the code is not an immediate option. If you want to take (and you ought to take) quick preventive steps, then web application Firewalls or any other containment technologies are the next step to look at.

As a matter of fact, you usually wanna do a two-phase approach. First, you immediately wanna put in place a Firewall, you wanna filter the content, and then you wanna also in parallel start looking at the root cause at the code level and fix at the code level, roll out a fix at the code level as well.
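As a rough illustration of the first, stop-gap phase, the sketch below filters incoming request parameters for obvious markup injection before they reach the application. It is a toy under stated assumptions: the pattern, the function name and the parameter dictionary are hypothetical, real web application Firewalls use far richer rule sets, and a filter like this is easy to bypass, which is exactly why the code-level fix has to follow.

```python
import re

# Hypothetical, deliberately simple pattern: opening script/iframe tags or javascript: URLs.
SUSPICIOUS = re.compile(r"<\s*(script|iframe)\b|javascript\s*:", re.IGNORECASE)

def is_request_allowed(params):
    """Return False if any submitted parameter value looks like injected markup."""
    return not any(SUSPICIOUS.search(value) for value in params.values())

# Ordinary input passes; input carrying an iframe tag is rejected.
print(is_request_allowed({"q": "drive-by downloads"}))
print(is_request_allowed({"comment": '<IFRAME src="http://attacker.example/x">'}))
```

The filter buys time at the edge; the root cause, unescaped or unvalidated input in the application code, still has to be fixed and rolled out.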

So there are many means at that level. With that, I’ll hand it over to Neil to talk more about the malware aspect of it, once you’ve taken care of the web application vulnerability aspect upfront.

Neil Daswani: In addition to making sure that you conduct an assessment of your site, and that can include both a vulnerability as well as malware risk assessment, that can tell you what is the likelihood that your website could get infected by malware, and taking the preventative steps that Lars talked about, it is also important to have detection, containment and recovery services and processes in place.

The reason is there might be a number of vulnerabilities on the site that are preventable. You know, if there is a cross-site scripting issue or an SQL injection issue, you can then put that on track to fixing the code and deploying it out on the site.

At the same time, if you have a third party partner that, say, provides you with an audience measurement widget that you are using, or a video playing widget, or you are dedicating part of your website space to advertising, then those widgets at any time could get infected and there may not be anything that you can directly do to prevent it.

But if you don’t take steps to detect this kind of activity, then you can end up with a scenario in which ads are being shown on your website, and basically when, say, a search engine crawler comes by, it may identify the fact that your site served a page and there happened to be an ad on it and it ended up sending a malware drive-by download or a fake antivirus package to the user. And if that occurs, then your website can get blacklisted by the major search engines, it can get blacklisted by the browsers, so it is important.

Your website can get blacklisted by the major search engines and browsers due to drive-by downloads.

If this is happening, even if the drive-by download wasn’t coming from any of the direct resources on your website, but was coming from one of your widgets, it’s important to have this kind of detection and containment in place.

So what you need in terms of detection is some kind of anti-malware monitoring. There are a number of organizations that can provide web anti-malware services. Basically, it typically encompasses providing your domain name and having all the URL’s on your domain enumerated, crawled and scanned, ideally using a behavior-oriented algorithm that fully renders all the content and ads and all the other aspects of your site to identify these kinds of issues.
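A full behavioral scan renders the page the way a browser would, which is beyond a short example. As a much weaker stand-in, the sketch below applies one static heuristic that such monitoring might include: flagging iFrames that are hidden or sized to zero or one pixel, a common trait of injected frames. The class and function names and the sample page are hypothetical, and a clean result from this check does not mean a page is safe.

```python
from html.parser import HTMLParser

class HiddenIframeScanner(HTMLParser):
    """Static heuristic: record the src of iframes that are invisible to the visitor."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        a = {name: (value or "") for name, value in attrs}
        style = a.get("style", "").replace(" ", "").lower()
        tiny = a.get("width") in ("0", "1") or a.get("height") in ("0", "1")
        hidden = "display:none" in style or "visibility:hidden" in style
        if tiny or hidden:
            self.findings.append(a.get("src", ""))

def scan_page(page_html):
    scanner = HiddenIframeScanner()
    scanner.feed(page_html)
    return scanner.findings

# Hypothetical page: one legitimate video frame, one zero-sized injected frame.
sample = (
    '<iframe src="/video" width="640" height="360"></iframe>'
    '<iframe src="http://bad.example/x" width="0" height="0"></iframe>'
)
print(scan_page(sample))
```

A monitoring service would run checks like this across every crawled URL and raise an alert on any finding.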

One great thing about the detection is when these kinds of issues occur, you get emailed, you get an alert sent to your cell phone so that you can help lock down the issue and take appropriate steps. If you are interested in automated containment, there are also open source modules like ‘Mod Anti-Malware’ which can be downloaded and installed on your web server so that if any issue occurs, pages on your site can continue being served but the malicious code automatically gets stripped out.
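The containment idea, keep serving the page but strip the malicious part, can be sketched as a response filter. This is not how ‘Mod Anti-Malware’ is implemented; it is an illustrative Python function with a hypothetical name and allowlist that removes iFrames pointing at hosts the site owner has not approved. Regex-based HTML handling is fragile and is used here only to keep the example short.

```python
import re
from urllib.parse import urlparse

IFRAME = re.compile(r"<iframe\b[^>]*>.*?</iframe\s*>", re.IGNORECASE | re.DOTALL)
SRC = re.compile(r"""src\s*=\s*["']([^"']*)["']""", re.IGNORECASE)

def strip_untrusted_iframes(page_html, allowed_hosts):
    """Remove iframes whose source host is not in allowed_hosts; leave the rest of the page intact."""

    def replace(match):
        src = SRC.search(match.group(0))
        if not src:
            return ""  # no recognizable source: drop the frame
        host = urlparse(src.group(1)).hostname
        # Relative URLs have no hostname and are treated as same-site.
        if host is None or host in allowed_hosts:
            return match.group(0)
        return ""

    return IFRAME.sub(replace, page_html)

# Hypothetical infected page: the login content and the approved video frame should survive.
page = (
    "<p>Login</p>"
    '<iframe src="http://evil.example/e"></iframe>'
    '<iframe src="https://video.example/p"></iframe>'
)
print(strip_untrusted_iframes(page, {"video.example"}))
```

Filtering on the way out keeps the site available while the injected code is tracked down and removed at its source.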

And then finally, to address the issue completely, the malicious code that might have been injected into some file, that might have been injected into some database – that needs to be removed. So once you employ all of these five steps: assessment, prevention, detection, containment and recovery, using some of the kinds of mechanisms we’ve talked about you can make sure that your website has a holistic process in place to address web security and security from malicious software.

Risk tolerance
So now that we’ve covered the existing threats and some of the things that you can do to address these issues, we’re gonna talk a little bit about the future – where do you go from here? Depending upon what kind of organization you are, this may have different implications for you. There is a pyramid here on this next slide on risk tolerance (see image) where we’ve seen that there are different kinds of websites and different kinds of ad networks, and they have different levels of sensitivity.

So some sites require mission critical security. Military websites may for instance fall within that category. Then there are sites for which security is extremely important: you can think here about websites that might be providing an email service so you can log in and check your email. So it is important that they have security at the appropriate level. And then there is a whole bunch of other websites out there. You know, my dentist, my gardener – they all have websites, and for them it is important to have some adequate level of security. And so there are these different levels of criticality, and there are different levels of protection that are required.

So, in the mission critical section you might have military websites, you might have websites of financial institutions. For them, it may be important to have on-premise software, and both Cenzic and Dasient have provided on-premise software to address these kinds of issues.

Managed services for both mission critical and important-security websites are provided by both companies; monitoring is one example. These two areas have been pretty well covered, but what we’ve been finding is that there may not be enough out there for adequate security for a large number of small and medium business websites. So Lars and I, together with our teams at Cenzic and Dasient, are now providing integrated vulnerability and web malware scanning from the cloud in a low cost way for this large range of websites. We are working together with a number of large web hosting providers to basically provide that level of managed in-the-cloud security to their customers.

There is a whole bunch of good reasons to do this because it is good for the web hosting providers themselves. They may have their reputation impacted if there are a lot of infected websites on their platform. And there’s a lot of small and medium size businesses that basically get hit significantly and cannot monetize their site anymore because of these kinds of issues. So together with the partnership of various web hosting providers, we are excited about helping further secure the world. So with that I am gonna turn it over to Lars, and he is gonna talk a little bit about transitioning the amount of risk over time for sites.

Lars Ewe: One of the challenges that we see with many organizations which are our customers that we work with closely is the sheer number of sites that they have. They often go through this process and they will find out that the first challenge is that they don’t know how many sites they have. So just taking their inventory turns out to be challenging for many. Once they have the inventory, there is a sort of negative surprise effect, which is how many they have. And as they start to digest that, they realize quickly that they really don’t have the resources to address more than a small fraction of those.

Website risk management
With the new offerings that we are talking about, there are better means now for these organizations to take a broader and an in-depth approach at the same time. We often refer to that as a sort of funnel (see image). And the idea of that is no different than a health check when you go to the doctor. You usually first get a quick health check, they take your blood pressure, maybe a blood sample – no more than that. And then, based on the findings of that they will decide what further steps need to be taken. This isn’t any different. Think of a solution approach that scales broadly so that we can actually scan thousands and thousands of sites for you on a very regular basis. And then, based on the risk metrics that come out of that process we will determine which applications need to go through a deeper process and through more inspection, if you will.

So it is really a risk management environment at the end of the day that funnels your applications based on various criteria through the funnel all the way down to the most critical, most robust applications that obviously receive a slightly different treatment than the ones at the top of the funnel. The idea here is to do that jointly, the idea is to do that both at the web application vulnerability level as well as at the malware level. You will see these offerings to be very scalable, you will see them to be very configurable, you will see them to be able to actually help you work across your entire portfolio of web assets.
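The funnel can be pictured as a simple triage: a cheap, broad scan produces a few numbers per site, a risk score ranks the sites, and only the top of the ranking goes on to deeper inspection. The sketch below is an assumption-laden illustration, not either company’s scoring method; the field names, the weights and the `.example` sites are all invented.

```python
from dataclasses import dataclass

@dataclass
class SiteResult:
    """Outcome of a quick, broad scan of one site (hypothetical fields)."""
    url: str
    vulnerability_findings: int
    malware_findings: int
    handles_sensitive_data: bool

def risk_score(result):
    # Assumed weighting: malware findings count more than vulnerabilities,
    # and sites handling sensitive data are doubled.
    score = result.vulnerability_findings + 5 * result.malware_findings
    if result.handles_sensitive_data:
        score *= 2
    return score

def select_for_deep_scan(results, capacity):
    """Pick the highest-risk sites, up to the number the team can inspect in depth."""
    ranked = sorted(results, key=risk_score, reverse=True)
    return [r.url for r in ranked[:capacity] if risk_score(r) > 0]

quick_scan = [
    SiteResult("brochure.example", 0, 0, False),
    SiteResult("shop.example", 3, 0, True),
    SiteResult("blog.example", 2, 1, False),
]
print(select_for_deep_scan(quick_scan, capacity=2))
```

With these made-up numbers the blog (score 7) and the shop (score 6) move down the funnel, while the clean brochure site stays in the broad, regular scan.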