Data Mining a Mountain of Zero Day Vulnerabilities 3: Vulnerabilities by Language, Supplier and Industry

In this part, Chris Wysopal breaks down the application vulnerability data by programming language, supplier type, and industry.

Vulnerabilities by Language

So, next I want to take a look at this by language because the language you program in makes a big difference in the kind of vulnerabilities the application is going to have.

This shows up both at the raw language level and in the APIs, whether that’s the runtime or a framework used with the language, which either support writing secure code or don’t. It also comes down to how the developers who work in these different languages are trained, which can either support or not support secure programming.

Top 3 vulnerabilities by language (Java, ColdFusion, C/C++)

So, if we look at Java, we have cross-site scripting, CRLF injection, and information leakage. It seems fairly safe, because the top categories either aren’t all that exploitable or aren’t being actively exploited.
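To make the first two categories concrete, here is a minimal sketch in Python (chosen only for brevity; the talk's data is about Java). The function names `render_greeting` and `safe_header_value` are hypothetical and not from the talk. It shows output escaping as one common defense against cross-site scripting, and rejecting CR/LF characters as one common defense against CRLF injection.

```python
import html


def render_greeting(user_name: str) -> str:
    """Escape untrusted input before placing it in HTML (mitigates XSS)."""
    return "<p>Hello, " + html.escape(user_name) + "</p>"


def safe_header_value(value: str) -> str:
    """Reject CR/LF so untrusted input cannot inject extra header lines."""
    if "\r" in value or "\n" in value:
        raise ValueError("CR/LF not allowed in header value")
    return value
```

Real web frameworks usually provide their own escaping and header handling, so treat this as an illustration of the idea rather than a drop-in fix.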

But if you look at ColdFusion, its second biggest category is SQL injection at 8%. So either ColdFusion doesn’t really support a coding style that prevents SQL injection, or its developers aren’t trained to prevent it the way Java developers are. What this tells me is that if I run into a ColdFusion application, I’m much more likely to find SQL injection than if I run into a Java application. There are more Java applications out there, but still, it shows that the language really does make a difference here.
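The coding-style difference behind SQL injection can be sketched as follows. This is an illustrative example using Python's standard `sqlite3` module, not code from the talk; the table, the data, and the function names (`make_db`, `find_user_unsafe`, `find_user_safe`) are assumptions made up for the demonstration. The unsafe version builds the query by string concatenation, while the safe version passes the input as a bound parameter.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Anti-pattern: untrusted input is concatenated into the SQL text.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the input is bound as data, never parsed as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With an input such as `x' OR '1'='1`, the concatenated query matches every row, while the parameterized one matches none, because the whole string is treated as a literal name.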

And if we look down here at C and C++, we can see that buffer overflow and buffer management errors are number 2 and number 3. If you have compiled C and C++ code, we all know that those are endemic problems with programs written in those languages.

Top 3 vulnerabilities by language (.NET, PHP, Android)

.NET looks a lot like Java, with cross-site scripting and information leakage, but cryptographic issues show up in its top 3 where they didn’t for Java, whose number 3 is information leakage. So it tells me that .NET isn’t supporting writing cryptographic code as well, or the developers aren’t trained as well in cryptography.

PHP, again, is sort of like ColdFusion: it’s not supporting writing code free of SQL injection as well as .NET and Java are. So, again, if you have a PHP application, you’re more likely to have SQL injection. And you can see here that directory traversal shows up, which isn’t in the top 3 for any of the other languages. So PHP, just from a web standpoint, looks pretty bad, right? We have two serious categories here, SQL injection and directory traversal, in addition to cross-site scripting, which seems to be everywhere. Anyone who has audited PHP apps knows that these issues are more likely with PHP apps.
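As an illustration of what a directory traversal check looks like, here is a minimal sketch in Python (the talk's data is about PHP, and this is not code from the talk). The function name `resolve_inside` and the example paths are hypothetical. The idea is to resolve the requested path and confirm it still lies under the intended base directory before opening anything.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Return the resolved path only if it stays inside base_dir."""
    base = Path(base_dir).resolve()
    # Joining then resolving collapses any ".." segments in user_path.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("path escapes base directory")
    return candidate
```

A request for `img/a.png` under `/srv/uploads` is accepted, while `../../etc/passwd` raises `ValueError`, because after resolution it no longer sits under the base directory.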

And then I threw in Android, even though we have a sample of only about 100 Android apps, to show what we are finding there. It actually ends up having a lot of cryptographic issues, which is pretty interesting. It looks like Android developers don’t understand how to use the crypto API as well on that platform, and they’re also baking in a lot of static crypto keys, which is definitely a bad idea.
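The static key problem can be sketched in a few lines. This is an illustrative Python example, not Android code and not from the talk; the constant `STATIC_KEY`, the environment variable name `APP_KEY_HEX`, and the functions `new_key` and `load_key` are all assumptions. It contrasts a key baked into the source with a key that is generated randomly and supplied at runtime.

```python
import os
import secrets

# Anti-pattern: a key compiled into the app is the same for every install
# and can be recovered by anyone who unpacks the binary.
STATIC_KEY = b"0123456789abcdef"


def new_key() -> bytes:
    """Generate a fresh 256-bit key from the OS random source."""
    return secrets.token_bytes(32)


def load_key(env_var: str = "APP_KEY_HEX") -> bytes:
    """Load a hex-encoded 256-bit key supplied at runtime (hypothetical name)."""
    value = os.environ.get(env_var)
    if value is None:
        raise RuntimeError(f"{env_var} is not set")
    key = bytes.fromhex(value)
    if len(key) != 32:
        raise ValueError("expected a 32-byte key")
    return key
```

On a mobile platform the key would more plausibly live in the platform's key storage than in an environment variable; the point here is only that the key is not a constant in the shipped code.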

Vulnerability Distribution by Supplier

Application vulnerabilities by supplier

There’s a lot of data here (see image). This is looking at the vulnerability distribution by the supplier type of the applications. Supplier type is who developed it: was it internally developed, was it commercial code, was it open source, was it outsourced?

And if you look across something like cross-site scripting – and take the outsourced stuff with a grain of salt here because it’s a small sample size – you see this is pretty consistent. Everyone’s writing cross-site scripting errors.

SQL injection ends up actually being pretty consistent too: we have 4% for ‘Internally Developed’, 3% for ‘Commercial’, and 3% for ‘Open Source’. So, based on supplier type, it looks like those are very consistent.

But if you look at directory traversal, it goes from 3% for ‘Internally Developed’ to 6% for ‘Commercial’ to 13% for ‘Open Source’. So there’s definitely a difference with directory traversal, and those look like significant numbers. I haven’t been able to figure out why that might be; if anyone has any ideas why it would change, I’d like to hear them.

Another thing that stands out is commercial code: there we see buffer management errors and buffer overflows. And I know why those aren’t in the other categories – it’s because commercial suppliers write their code in C and C++.

Vulnerability Distribution by Industry

Application vulnerabilities by industry

And then we sliced the data by industry, so this is about who is going to be operating the code, whether they built it themselves or they’re purchasing it. This ‘Government’ category is all US government; we don’t have any foreign government customers. For some reason they don’t want to send their code to the US – because of the USA PATRIOT Act, I think. But the US government doesn’t mind, because they already have the code.

Finance and software were the two other industry verticals where we had a significant amount of data. We do have data from retail, manufacturing, and healthcare, but the sample size for each of those was fewer than a hundred apps, so I didn’t want to use that data. For the three shown here we have thousands of apps in each category.

And you can see that there is a significant difference with cross-site scripting: 75% of apps were affected in government, 67% in finance, and 55% in the software industry. In the case of the software industry, I think part of the reason is probably that they’re writing a lot of non-web apps; but it’s interesting that finance is doing a somewhat better job of writing code without cross-site scripting.

SQL injection goes from 40% to 29% and 30% by industry, so government is writing code with a lot more SQL injection issues in it; they should definitely change something there. That looks like a pretty high number. So, if you’re going to attack government websites, I probably didn’t have to tell you this, but you’ll find SQL injection in there.
