Engin Kirda, the co-founder of Lastline Labs, took the floor at Black Hat USA to give a retrospective view of ransomware and analyze its present-day flaws.
Hi! Good afternoon everyone. Thanks for showing up. I have the pleasure of having the last session. Hopefully it’s not the curse of having the last session. So, briefly about my background. I’m a computer science professor at Northeastern University in Boston. I’ve been doing malware research for the last 10 years or so, and I have built some popular malware analysis systems like Anubis, EXPOSURE and Wepawet that some of you have maybe used in the past. And I’m also one of the co-founders of Lastline that does zero-day threat protection, so we work on malware. And Lastline Labs is actually our research arm.
This work is partially based on a study that my Ph.D. student Amin Kharraz actually worked on and I published, with some co-authors, at a conference called DIMVA 2015 (see right-hand image). There’s a scientific paper that goes with this presentation. If you google for “Cutting the Gordian Knot: A Look Under the Hood of Ransomware Attacks”, Google is going to spit out the PDF, and if you are interested in the technical details I would refer you to that paper. There’s definitely more information there. I have a short session, that’s why it’s going to be a short talk.
So some key takeaways from this presentation. The majority of ransomware actually launches relatively straightforward attack payloads. When I say that most ransomware isn’t complex, some people are going to find that provocative. They are going to say “No, but ransomware is malware, you know, they do all these things, they are actually complex.” So we are going to look at some examples, we are going to look at the big picture, and my aim is to set the problem in perspective to show you that not all of them actually are as complex.
In many cases, we are actually seeing relatively straightforward attacks (see right-hand image), and I think there is hope. We should be able to do things to actually detect some of these attacks more efficiently. These are things like using bad cryptography, or the use of standard cryptography libraries, which is something that we might actually use to detect ransomware. Or sometimes files are deleted but they are not wiped off disk, right? So you might actually have the chance to recover the data. Not all ransomware is actually equal.
Compared to other types of malware, ransomware actually has very distinct, predictable behavior. We are going to go through some examples. Ransomware is a specific type of malware, but it does things that are quite unique to ransomware. These are things like ransom notes with background activity, background noise; changes in the entropy of files – when things are being encrypted the entropy of files changes; iteration over large numbers of files. So these are typically things you might not see in other malware or in benign software. Hopefully we should be able to use these things to detect ransomware more effectively.
So what are we going to discuss? Well, the significance of the ransomware threat (see right-hand image). Definitely, it is a threat. I’m not saying that it’s not a threat. But not all threats are very complicated, although they might be successful. We are going to look at the complexity and sophistication of attacks. So what do we mean by complexity? And why do most people, when they hear of ransomware, think it’s actually very complex? What are the attack mechanisms we actually see out there if you look at ransomware at a large scale?
And what are the main ransomware weaknesses? They do certain things, but can we actually use these weaknesses to be able to detect ransomware more effectively? Can we develop technologies that actually use these weaknesses to detect ransomware? And also I’ll be talking about better mitigation, so my aim is to hopefully close Black Hat with a positive message. Not all is lost, and we should be able to do a better job of at least detecting things like ransomware.
Just to recap so that everybody knows ransomware (see right-hand image). What are the typical behaviors that we see in a typical ransomware attack? Well, of course, the victim machine would be compromised, then the ransomware would be installed. Once the attack payload is executed – if there is an attack payload – the ransomware would inform the victim of the attack. Compared to other types of malware, this is actually quite distinct behavior. Something bad happens to you, you get infected, and the ransomware actually tells you that you have been infected, right? You don’t always have this luxury in other types of malware. Ransomware actually tells you that you’ve been infected.
The victim would need to pay up, of course, otherwise the data would be kept hostage or it would be destroyed. Any malware that actually fits this category today we actually say is ransomware. And you’ve been reading a lot in media about this, because we’ve been seeing ransomware and people are being attacked.
Classic ransom notes would be something like that (see right-hand image). It is social engineering, of course. It looks like it’s coming from the NSA, FBI and all these organizations. At the same time, it’s also the PRISM system, right? And the attackers are social engineering the victim into believing that the victim has been caught hosting illegal content. And they say if you don’t pay up you are going to be arrested, the government is going to come after you. And many people, especially end users, are technically not sophisticated and they fall for these scams.
One interesting signal is that there have been cases where the bad guys, who are actually hosting illegal content, fell for these social engineering scams and they gave themselves up, they went to the police. Maybe that’s one good thing that ransomware has done once in a million years.
Here’s another example (see right-hand image). Again, these all look quite similar: “Your computer has been locked!” It’s supposedly from the FBI. You have to pay up in three days otherwise you are going to be arrested. And many people actually fall for these things, and that’s why ransomware is effective. But it’s not too different from other types of malware that you see, for example fake AV, where you think that you’re buying an AV product but it’s actually a fake AV product.
So how has ransomware evolved over the years? Well, the ransomware concept actually dates back to the end of the 80s or the beginning of the 90s, right? People came up with this idea. It has been around for a long time, but it has been rediscovered. Clearly, ransomware attacks have actually increased in numbers over the last five years. We’ve been seeing more and more ransomware. Some of the samples are more sophisticated than the earlier, simpler variants. Damages are being reported. And it’s interesting, people like this idea of encryption, deletion, especially encryption – it’s magical. That’s why there are many reports, and a typical end user thinks that ransomware is a very-very complicated thing.
Also, this is fueled by many security reports that talk about the sophistication and the complexity of individual attacks. Some reports might say “We just saw this example, this sample does encryption in a very-very good way, and we cannot recover the data because the encryption is sound.” Reports like that create the general impression in the public that we are faced with a new threat that is very difficult or that’s impossible to prevent. Because if the information has been encrypted in a very-very good way, then we cannot decrypt it if we don’t have the key.
There is truth in that. Some attacks are effective, even simple ones are effective. Here is an announcement from the FBI (see right-hand image) that actually reports that many people were victims of Cryptowall, many people ended up actually paying money to Cryptowall, and $18 million were lost (see left-hand image).
So there is damage, right? But the question then is, if you look at the code, if you analyze the attack, how much sophistication are we actually seeing there? Is this another type of Stuxnet, or are we dealing with common behaviors that you also see in other malware? Is ransomware a lot different than other types of malware that we see out there?
Not only end users, of course, are victims of such attacks. Organizations generally are well protected, so a typical company is not going to be scared of ransomware, because they are going to have good backup policies hopefully, or they are going to have systems that are more effective against malware. But there are smaller organizations. This (see right-hand image) is an example from a small town in Massachusetts, where the police ended up paying the ransom because their machines were attacked and they could not recover the data. But why is that happening? Is it happening because the ransomware is very complicated? Or is it happening because the organization was ill-prepared and didn’t have the right defenses or the right security policies? So in this case, yes, ransomware attacked them, but any other type of malware could have also attacked them.
So what is complexity and sophistication when we talk about this? Everybody says it’s very complex and very sophisticated. A typical way of measuring ransomware sophistication is to look at the code and to look at evasion. Any person is going to say, oh yeah, this is very complicated because you are having a tough time detecting it. Because you are having a tough time detecting it, it must be complicated, it has a certain sophistication. So we look at evasion, we look at things like packing, we look at dynamic checks. We look at encryption, right? Are they using a good encryption algorithm, or are they using a weak one? So these are the type of things that we actually look at, and then we say it’s actually a complex attack.
Evasion, of course, is not something that you only see in ransomware. You also see it in other types of malware, so it’s actually common behavior, right? It’s not unique to ransomware. So we look at things like stalling against the analysis environment or self-modifying code that adapts itself. In this work we are actually looking at the sophistication of the attack after compromise. So we are going to look at what ransomware actually does, if we look at the big picture. Of course there are samples that do more nasty things than others, but how complicated are they and how complex are these attacks? And then you can make up your mind about it.
To be able to do this, we collected some samples, we took a historical look at ransomware (see right-hand image). We looked at samples from 2006 to 2014. We looked at more than 1300 samples from 15 families, including modern families like Cryptolocker and Cryptowall. We did this by crawling the web, looking at public repositories, getting some data from Lastline as well. And we analyzed these files and tried to gain some insight into what we saw in the past and what we are actually seeing today.
We did automated dynamic analysis for all the samples (see left-hand image). In some cases, after running a sample and if there were issues, if we thought it was necessary we did manual analysis too. So one challenge here is, if you are looking at any malware sample, how do you actually know that it belongs to that family? How do you know it’s Cryptolocker or Cryptowall, etc.? So our methodology was that we cross-checked with VirusTotal, and if three or more scanners actually agreed on the sample and gave it the same name, we said, okay, this looks like a sample from Cryptolocker, Cryptowall, etc. So we created a labeled data set. And all the samples we actually looked at showed some ransomware behavior.
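To make that labeling rule concrete, here is a minimal sketch of the "three or more scanners agree" check. The scanner names and the dictionary format are made up for illustration, and real scanner labels are usually messier than a bare family name, so the normalization here is an assumption rather than what we actually ran.

```python
from collections import Counter

def consensus_label(scanner_labels, min_agree=3):
    """Return the family name that at least `min_agree` scanners agree on, else None."""
    # Normalize so that trivial case/spacing differences do not split the votes.
    votes = Counter(
        label.strip().lower() for label in scanner_labels.values() if label
    )
    if not votes:
        return None
    family, count = votes.most_common(1)[0]
    return family if count >= min_agree else None

# Hypothetical scan result for one sample (scanner names are placeholders).
report = {
    "scanner_a": "Cryptowall",
    "scanner_b": "cryptowall",
    "scanner_c": "CryptoWall ",
    "scanner_d": "Generic.Trojan",
    "scanner_e": None,  # this scanner did not flag the sample
}
print(consensus_label(report))
```

Samples for which no name reaches the threshold simply stay unlabeled, which keeps the labeled data set conservative.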
So what are the attack payloads? Encryption, of course, is a popular thing. About 5% of the samples that we actually looked at were using some sort of encryption. It’s generally well-known that older samples used to implement this encryption themselves, so basically the malware authors decided that they wanted to implement this. So there is custom encryption, which actually leads to mistakes. There have been many reports where you actually look at the encryption, it’s actually not implemented well, or the attackers make mistakes like leaving keys on disk. And you can actually use these things to decrypt the data. This is something that used to happen quite a bit in the past. It was not perfect.
Recently we have seen more samples that actually use standard libraries (see right-hand image). Current popular families like Crypt0l0cker and Cryptowall use the Windows crypto libraries. These are standard libraries, and they are being used because they are standard libraries. They do the crypto, they do the software engineering, so if something is encrypted with them, of course it’s difficult to decrypt it, especially if you don’t have the key.
The question, though, is whether this is sophistication or just good software engineering. Every good software engineer knows that you don’t implement encryption algorithms yourself. You use things that other people have implemented and verified. So this is actually good software engineering. The fact that they are using standard libraries is good practice. It just makes their product more stable, but it’s not that sophisticated in the sense that they are using standard libraries.
Using strong crypto libraries is actually a double-edged sword for the attackers. On the one hand, they can actually create damage that’s irreversible, that’s tough for us. But at the same time, if you are doing dynamic analysis, if you are looking at this piece of code before it reaches the end users, it gives you the chance to catch certain things such as the use of these libraries. So if you see that something suspicious is using crypto libraries, maybe you can actually use that against that sample, and you’ll know that you are potentially dealing with a ransomware sample.
What are the deletion mechanisms? About 36% of the five most common ransomware families in the data set were deleting files (see right-hand image). If you didn’t pay up, the files were actually being deleted. Most of the deletion, in fact, was quite straightforward. How would a professional person do this? They would actually aim to wipe the disk so that it’s difficult to recover the data. You would write over the disk, you would wipe that file off the disk. But most of them were, of course, lazy, and they were directly working on the Master File Table entries and marking things as deleted, but the data was still remaining on disk.
So yes, it was deleted, but you had the potential and the possibility of recovering the data in some of these samples, because it wasn’t actually completely corrupted. So recovery was actually possible in many cases. Just because you are hit by ransomware doesn’t mean that you cannot recover the data. In some cases, if it’s a simpler variant you might be able to recover the data, too.
But at the same time, this use of the MFT could be an effective avenue for detecting ransomware during analysis. If you see a lot of MFT activity, maybe bulk deletions being performed during the analysis, that might give you the chance to say, hey, this looks suspicious, benign software wouldn’t typically do this. Combining it with other things that you see would maybe give you the possibility of detecting ransomware more effectively.
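As a rough illustration of spotting bulk deletions, the sketch below counts delete operations inside a sliding time window. The event format (a timestamp in seconds plus an operation name), the window size and the threshold are all assumptions for the example; a real analysis system would take these events from its own file system monitoring, and the events are assumed to arrive in time order.

```python
from collections import deque

def has_bulk_delete_burst(events, window_seconds=10.0, threshold=100):
    """Return True if `threshold` deletes fall within any `window_seconds` span.

    `events` is an iterable of (timestamp_seconds, operation) pairs,
    assumed to be sorted by timestamp.
    """
    recent = deque()  # timestamps of deletes still inside the window
    for ts, op in events:
        if op != "delete":
            continue
        recent.append(ts)
        # Drop deletes that are older than the window.
        while recent and ts - recent[0] > window_seconds:
            recent.popleft()
        if len(recent) >= threshold:
            return True
    return False

# A burst of 150 deletes within about 1.5 seconds versus one delete per second.
burst = [(i * 0.01, "delete") for i in range(150)]
slow = [(float(i), "delete") for i in range(150)]
print(has_bulk_delete_burst(burst), has_bulk_delete_burst(slow))
```

On its own this is a weak signal, since installers and cleanup tools also delete many files, which is why it is better combined with the other behaviors.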
Many samples, of course, also lock the desktop (see right-hand image). That’s another classic ransomware behavior. Rather than deleting things, rather than encrypting them, in the course of a simple attack you basically lock the desktop and you keep the users out of the desktop. This is classic ransomware behavior. More than 60% of the samples that we looked at – if you think of this historically, this was a very popular thing in the past – use things like CreateDesktop to create a persistent new desktop, and you just lock the user out of the machine. And because the end user is not very technically sophisticated, they think they can’t recover the data. But in reality, the data is still there. So you do have a chance to recover the data.
Another approach is to display HTML pages and disable certain components so that it’s actually difficult for the user to get rid of that. Again, they are locked and they can’t access their machines. In all cases, though, a message was displayed to the victim. So remember the ransom notes that I showed at the beginning. If you think about it, they have very similar messages. These messages are quite similar, so we should be able to do some sort of analysis as well to be able to detect these types of attacks, especially if something is locking the desktop and displaying it. In a way, it’s not that complicated also from a defense point of view, because there are things we can do automatically to be able to detect this type of an attack.
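Because the messages are so similar, even a simple keyword check gets you somewhere. The sketch below assumes the text of the displayed message has already been extracted (for example from the HTML page or from a screenshot); the term list and the threshold are illustrative guesses, not a tuned detector.

```python
import re

# Illustrative terms taken from the kind of notes shown earlier; not exhaustive.
RANSOM_TERMS = (
    "locked", "encrypted", "pay", "fine", "arrested", "fbi", "bitcoin", "days",
)

def note_terms(text, terms=RANSOM_TERMS):
    """Return the sorted list of ransom-note terms that appear as whole words."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(t for t in terms if t in words)

def looks_like_ransom_note(text, min_hits=3):
    """Flag text that contains at least `min_hits` distinct ransom-note terms."""
    return len(note_terms(text)) >= min_hits

note = ("Your computer has been locked! The FBI has detected illegal content. "
        "You must pay the fine within 3 days or you will be arrested.")
benign = "Your download is complete. Click OK to continue."
print(note_terms(note))
print(looks_like_ransom_note(note), looks_like_ransom_note(benign))
```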
Locking mechanisms are also typically a nuisance, but the data is typically not harmed. So the data is still there, and if you are a more technically sophisticated user you can basically unlock your desktop without paying up. Of course, what you are seeing is that if there are simple things, the attackers take the simple route because you still make the money, you don’t have to be super-complicated. This is something that we also see in other types of malware, not just in ransomware. They take the easiest path.
What about better mitigation, though? Can we actually do things to mitigate this threat in a better way? The Achilles’ heel of ransomware, actually, is that the ransomware has to inform the victim that the attack has taken place. This is actually quite unique to ransomware, and this is something that you don’t see in other types of malware. If you look at nasty malware, they try to remain hidden, they try to remain stealthy. They might inject themselves into other processes, because if you do a “ps” you wouldn’t see it running, right? They try to hide the activity. And they try to remain hidden over longer periods of time as they are doing bad, malicious activity in the background.
Compared to that, ransomware gives us the advantage by actually informing us, right? So it tells us “Hey, you’ve been attacked, I’ve just encrypted all of your files.” So from a defense point of view, if you are analyzing something before it reaches the end users, it gives you that advantage (see right-hand image), it tells you “I am bad and I’ve actually infected you.” So this behavior is actually inherent in ransomware’s nature, this is something that is difficult to change because this is how money is made with ransomware.
There are certain behaviors that are predictable, things like entropy changes. When you are actually encrypting a file, the entropy of that file changes, which is one way that you can detect that the file has been encrypted. Or if a modal dialog is displayed and there is some background activity at the same time, this is something that you can use to detect suspicious ransomware-like behavior.
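To show what an entropy change looks like, here is a small sketch that computes the Shannon entropy of a byte string (in bits per byte, from 0 to 8) before and after a simulated encryption. Random bytes stand in for ciphertext, and the two thresholds are assumptions picked for the example. Keep in mind that compressed files also have high entropy, so this is a hint and not proof.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_encrypted(before: bytes, after: bytes, jump=2.0, high=7.5) -> bool:
    """Flag a file whose entropy rose by at least `jump` and is now near 8 bits."""
    e_before, e_after = shannon_entropy(before), shannon_entropy(after)
    return e_after >= high and (e_after - e_before) >= jump

plain = b"Quarterly report: revenue was flat, costs were flat. " * 200
scrambled = os.urandom(len(plain))  # stand-in for the encrypted version

print(round(shannon_entropy(plain), 2), round(shannon_entropy(scrambled), 2))
print(looks_encrypted(plain, scrambled))
```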
Or accessing “honey” files – what if you are doing dynamic analysis and you put these files that look like interesting files, but you have created them and suddenly you see that they are being touched and their entropy has changed, so somebody is encrypting them? This is behavior that you would not see in a benign sample, but this is something that ransomware would do. So these behaviors should be predictable and detectable. We should be doing a better job in dynamic analysis systems of catching things like that.
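A honey-file check can be sketched in a few lines. The version below plants decoy files, remembers a hash of each, and later reports which ones were modified or removed. The file names and contents are invented for the example, and comparing hashes is a simplification I am swapping in here; an entropy comparison on the changed files could be layered on top.

```python
import hashlib
import os
import tempfile

def plant_decoys(directory, names=("passwords.txt", "invoice_2015.doc")):
    """Create decoy files and return a {path: sha256} baseline."""
    baseline = {}
    for name in names:
        path = os.path.join(directory, name)
        content = ("decoy content for " + name + "\n").encode() * 50
        with open(path, "wb") as f:
            f.write(content)
        baseline[path] = hashlib.sha256(content).hexdigest()
    return baseline

def touched_decoys(baseline):
    """Return the sorted decoy paths that were modified or deleted."""
    touched = []
    for path, digest in baseline.items():
        try:
            with open(path, "rb") as f:
                current = hashlib.sha256(f.read()).hexdigest()
        except FileNotFoundError:
            touched.append(path)  # a deleted decoy is just as suspicious
            continue
        if current != digest:
            touched.append(path)
    return sorted(touched)

with tempfile.TemporaryDirectory() as sandbox_dir:
    baseline = plant_decoys(sandbox_dir)
    # Simulate a sample overwriting one decoy with random-looking bytes.
    victim = sorted(baseline)[0]
    with open(victim, "wb") as f:
        f.write(os.urandom(1024))
    print([os.path.basename(p) for p in touched_decoys(baseline)])
```

Since nothing legitimate has a reason to write to these files, any hit is worth a closer look.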
Let’s look at an example (see right-hand image). If you look at Cryptolocker and you run it in a dynamic analysis environment, the types of behavior you would see would be things like the autostart is modified. Why? Because the attacker is trying to make this persistent. Every time you reboot, the same thing is displayed. You would see things like memory activities, you would see network activities. So all sorts of things you would also see in other malware samples would be displayed typically by ransomware, too.
But one thing that you would see, potentially, would be evasion attempts. They might be checking for specific image filenames, or they might be checking for certain module names that typically get automatically loaded into sandboxes. Or they might look for certain indications that the code is running in a sandbox to be able to detect this. So yes, evasion is a possibility, and this is what we actually often say is complexity, but this is not unique to ransomware.
What is unique to ransomware is that you might see file activity, so you might see that this thing is searching for files across mounted drives. Or you might see that it’s iterating over directories. So you run it, there is some evasion, but at the same time it’s doing ransomware-like things, because it’s trying to encrypt all the files – very suspicious.
If you look at the report, other things you might be able to see would be loaded libraries (see right-hand image). You see a bunch of libraries that are loaded: there are standard ones, there are kernel operations. But one thing that actually pops up is that it’s using cryptbase.dll. And what is that? That is actually Microsoft’s base cryptographic API DLL. This by itself doesn’t actually tell you that you are dealing with a malicious sample, because benign software could also use this. But when you start combining these behaviors – you have cryptography usage, you have something that’s actually searching over many files, in some cases it might be displaying something and doing this in the background – we should actually be doing a good job detecting these things, and these behaviors are not too complex. These are actually things that we should be able to catch.
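Combining these weak signals can be as simple as counting how many of them show up in one analysis run. The sketch below assumes a hypothetical `Observations` record filled in by the sandbox; the field names and the file-count threshold are made up for illustration and are not taken from any particular product.

```python
from dataclasses import dataclass

@dataclass
class Observations:
    """Hypothetical summary of one dynamic analysis run."""
    loaded_modules: list
    files_enumerated: int
    files_with_entropy_jump: int
    displayed_fullscreen_message: bool

def ransomware_signals(obs, file_threshold=500):
    """Return (score, signals): how many ransomware-like behaviors were seen."""
    signals = {
        "crypto_library": any(
            m.lower() == "cryptbase.dll" for m in obs.loaded_modules
        ),
        "mass_file_iteration": obs.files_enumerated >= file_threshold,
        "entropy_changes": obs.files_with_entropy_jump > 0,
        "ransom_note": obs.displayed_fullscreen_message,
    }
    return sum(signals.values()), signals

benign = Observations(["kernel32.dll", "cryptbase.dll"], 12, 0, False)
suspicious = Observations(["kernel32.dll", "CRYPTBASE.dll"], 4200, 310, True)
print(ransomware_signals(benign)[0], ransomware_signals(suspicious)[0])
```

A benign program that only loads the crypto DLL scores 1 here, while a run that shows all four behaviors together scores 4, which is the kind of combination worth flagging.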
So the key takeaways (see right-hand image). Again, the majority of ransomware actually launches relatively straightforward attack payloads. If you look at the attack that is taking place, yes, encryption happens and the encryption might be good. But if you are doing some analysis, maybe at a gateway or something, before this thing actually hits the users, you should be able to catch things like the use of standard cryptography libraries. And even after the infection, in some cases, bad crypto is used, there are many examples of that; or things like files are deleted but they are not wiped off disk.
So, again, typically malware authors take the easiest path and they don’t always put a lot of complexity into whatever they create. So it’s not always Stuxnet that we’re talking about that actually shows a lot of real complexity.
Compared to other types of malware that try to remain stealthy and in the background, ransomware actually has very distinct, predictable behavior. It does these things, and we know that it does these things – things like ransom notes with background behavior. So you might be able to do OCR and look at these images automatically, trying to extract certain words, and then use them during your detection phase to detect malicious code like ransomware. Or the changes in the entropy of files – this is actually difficult to hide. So if you see that happening you know that the file is being encrypted, right? Things like that should give you an indication that should allow you to catch ransomware with better technology. These could also be things like the iteration over large numbers of files.
Thanks so much!