Ben Hagen, an acclaimed security consultant from the US who ran Application Security for the Obama re-election campaign, delivers a talk at the 29th Chaos Communication Congress (29C3) to share his insider’s view of the recent Presidential Election campaigns from a security perspective.
I’ve been based out of Chicago for about 8 or 9 years, I think. Chicago is a pretty interesting place to be in technology. The technology community there is pretty small, we all kind of know each other, and after doing about 3 or 4 years of consulting, I was having dinner with a bunch of friends in technology in Chicago, one of whom had recently started as the CTO for the Obama 2012 re-election campaign.
Naively, in response I said that I thought I could help them with that. And a couple of months afterwards I joined the campaign as a Senior Application Security Engineer, which is a little misleading, because I was the only application security person, or the only full-time security person within the entire campaign.
I also helped out with enterprise security, which is the sort of traditional IT role you see in an organization: headquarters networking, email, secure architecture of our field office networks, and that kind of thing. I helped deploy the IDS (Intrusion Detection System) within our headquarters, monitored it, trained people, helped people understand what security meant within the organization, and acted as a general resource for security.
That was a pretty big win, and technology played kind of a groundbreaking role in this election. This was the first time we saw a lot of donations happening online, and we saw a lot of online communities spring up in support of the President. He had a pretty sophisticated homepage that allowed people to discuss and comment on different news articles and communicate that way.
They also had their share of security problems, and some of them weren’t very serious, but garnered really big headlines. And I think that leads to one of the problems that you have in political campaigns: no matter how small the problem is, it’s going to cause headlines everywhere.
So, it was a pretty small, pretty normal, run-of-the-mill stored cross-site scripting bug that redirected people because of a flaw in the commenting system, basically. And, you know, cross-site scripting happens every day, but you don’t really see headlines in major newspapers regarding those issues on normal websites. We did have more serious issues.
I think this time around we kind of stepped up the technology game as well, where we took what was done in 2008 and kind of did more of everything. We built a lot more applications; donations were a much bigger part of it, online communication was a much bigger part of it, as well as the advertising around all of that.
Online donations in 2008 amounted to about 500 million; that jumped again in 2012, to 690 million, and that was a really big achievement, mostly because it was really questionable how people would approach Obama this time around. The country was in economic turmoil, lots of people were unemployed, and it was really hard to know how things would turn out. Total donations in the 2012 campaign exceeded 1 billion dollars, so that’s a lot of money. And that makes us a target; or rather, it makes threats a very big part of what we have to deal with in technology.
Basically, we had deployments in several different availability zones; those are different data centers. All of these things are interconnected, lots of very interesting applications. One that was most talked about is called Narwhal; it’s kind of a big data backend system that collects data, normalizes it, does processing, does interesting analytics, and acts as an API for some of the other applications within the campaign.
We also developed what we called the Online Field Office, the Dashboard: essentially, it’s a fully independent social network that had a lot of the same capabilities you would see in Facebook or any other social media application, but it was meant to help people organize online and communicate campaign goals and activities to people by region, neighborhood or interest; you could kind of group people and choose how your communications went out. Really interesting stuff.
We also had call tools, which enabled you to log onto the website and then, if you were interested in volunteering for the campaign, make phone calls to potential voters and try to talk them into voting for Obama, confirm that they were interested in Obama, or help them vote, that kind of thing. Not everything ended up being super popular, but a lot of these things were very effective; with the call tool, for example, over a million calls were made on election day just through that one tool. We also had single sign-on services, a bunch of different stuff.
On the nation-state side, it kind of goes back to 2008, when the systems were compromised, probably with the intended purpose of stealing economic or foreign policy information. Nation-state actors are kind of that classic, what people call “advanced persistent threat”: basically, motivated opponents who are willing to spend large amounts of resources and time to compromise the system.
Organized crime doesn’t necessarily mean the mafia or something, but kind of the typical criminal resources intent on stealing money, probably. So, at the campaign we had a lot of money: over a billion dollars came in, we took a lot of credit card transactions, we took a lot of donations. We had an online e-commerce store. Obviously, all of that can be a target for typical criminal activity with the intent of stealing money, stealing credit card numbers, that kind of thing. Most of the threats we saw from that were kind of the typical web scanning, attacks on the actual infrastructure itself, SQL injection, that kind of thing. Thankfully, we’re not aware of anything actually working against us, which is great.
Hacktivism, this is kind of the Anonymous threat that people talk about. Most of the things we see there are denial-of-service with the intent of making a political statement, or attempts to compromise the system, steal the information and then publish it with the intent of calling notice to some sort of political cause or something like that. You know, sort of the typical modus operandi for Anonymous is to steal information, publish it on Pastebin, and then say: “Hey, look, these people suck. Don’t vote for them”, or “Read our message”, etc.
Political opponents – I’m not necessarily talking about the Republican Party in this case; I’m talking about activists who are against Obama. It’s really not in the interest of either campaign to attack the other; that would be really big news, or a real problem, if it ever came to light, so we weren’t really worried about that. We were worried about political activists staging protests or trying to cause sabotage or fraud within our online systems.
And finally, attacks of opportunity, and this is kind of the general bucket. If you have a server on the Internet, even if you’re not any particular target, you’re going to get attacked by the background noise of people scanning the Internet for vulnerabilities. And this is usually people looking for vulnerabilities in commonly deployed software: phpMyAdmin, different Java exploits, different framework exploits, that kind of scanner activity that isn’t really looking for you, but it’s looking for anything it can exploit. So, always worry about that as well.
For example, a policy analyst, or a group of policy analysts, would be emailed by their manager – of course, from a random Yahoo email account or something – but the manager would say: “Hey, look, I’ve got this new information about this event that happened yesterday. You guys need to read this and have a report for me tomorrow.” In fact, it’s a spear phishing campaign; the attachment is, of course, malware designed to infect a computer, following the typical infection cycle of having a dropper that downloads a rootkit and then communicates back to a central resource. We had this kind of thing on a weekly basis.
We had several instances of denial-of-service or attempted denial-of-service; most of these we could trace back to some sort of online activism. Lots of threats came from Anonymous, where you’d see people in IRC talking about how they hated Obama: “Let’s denial-of-service him!” A couple of hours later they’d have some sort of script ready for people to download, and they’d start attacking us. We never really saw a huge effect from that; it more or less came down to: if you could keep them at bay for 30 minutes or so, people would lose interest and just stop trying. And using Amazon’s infrastructure, using things like Akamai caching and the like, it was really pretty easy to avoid any noticeable effect from that.
This is kind of the most obvious example: we saw this online, we were able to trace it back to somebody’s account on our systems, and at that point it’s pretty easy to mitigate the effect that guy has. But it did kind of point us in the direction of implementing further features to prevent fraud within the systems. We had some pretty sophisticated algorithms designed to detect abnormal behavior in tools such as the call tool to highlight and mitigate that threat.
Snorby is an event management system, so you can take the logs out of an IDS; it’s a web application that lets you monitor, make comments, note stuff, and do investigations.
BroIDS is something else we deployed, and it’s a great system. It’s not really your typical IDS; it’s more of a monitoring system that can watch network traffic and log the things that are interesting to you. We had it logging things like DNS queries, and we would push those into a database and do data mining on them, looking for activity related to malware or compromised machines. So, you can do some really creative stuff with BroIDS. Nessus – you have to pay for it, but it’s a good vulnerability assessment tool. Nmap is one of my favorite tools ever invented, so we used that a lot.
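As a rough illustration of the DNS-log mining described here, the following minimal Python sketch assumes a Bro-style tab-separated dns.log and a made-up blocklist of malware domains; the file names and domains are placeholders, not the campaign’s actual pipeline:

```python
import sqlite3

# Placeholder inputs: a Bro-style tab-separated dns.log and a small, made-up
# list of domains associated with known malware.
DNS_LOG = "dns.log"
SUSPICIOUS = {"evil-c2.example.com", "dropper.example.net"}

def load_queries(path):
    """Yield (timestamp, source_ip, query) tuples from a Bro-style dns.log."""
    fields = []
    with open(path) as fh:
        for line in fh:
            if line.startswith("#fields"):
                # The header line names the columns for the rows that follow.
                fields = line.strip().split("\t")[1:]
                continue
            if line.startswith("#") or not line.strip():
                continue
            row = dict(zip(fields, line.rstrip("\n").split("\t")))
            yield row.get("ts"), row.get("id.orig_h"), row.get("query")

# Push the queries into a database so they can be mined later.
db = sqlite3.connect("dns_queries.db")
db.execute("CREATE TABLE IF NOT EXISTS dns (ts TEXT, src TEXT, query TEXT)")
db.executemany("INSERT INTO dns VALUES (?, ?, ?)", load_queries(DNS_LOG))
db.commit()

# A simple mining pass: which internal hosts looked up suspicious domains?
for src, query, hits in db.execute(
    "SELECT src, query, COUNT(*) FROM dns GROUP BY src, query"
):
    if query in SUSPICIOUS:
        print(f"possible compromise: {src} queried {query} {hits} times")
```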
It’s a pretty frightening thing to actually use, because once you use it, you realize how effective this approach can be. I think in our first attempts at using this within the campaign we had a 25% click rate on attachments, and about a 12% hit rate of people entering credentials into fake websites we had put up.
We registered a random Barack Obama-affiliated web domain and had people go to a website there, fill in their email, name and password, and click Enter. We would record the people who actually entered their information, and then we would make them take extra training, so we would walk up behind them at their desk and say: “Hey, remember that email you clicked on? You’ve got to come with us and do some extra training now.”
It ended up working: by the end of the campaign we had a much lower hit rate on those campaigns. I think it worked out pretty well, and it was kind of fun to be that evil guy who’s tricking your people into clicking stuff. In our online chat channels, when we would launch something, we’d suddenly see people pop up and say: “Hey, I got this really weird email, did you guys get this?” And people would be like: “No, I didn’t get that. What is it?” And it goes to, like, “Barack Obama” with zeros instead of O’s, and it was asking for their login information. Or we would have people emailing me in a panic, saying: “I clicked it, I clicked it, I put my name and password, and I know I did something wrong. What do I do now?” And it’s pretty satisfying.
The other general thing is just keeping people in the loop, so if you have a particular threat you’re worried about, I think it’s a mistake to hold that information back from people. I think it’s always better to inform your staff that you’re worried about something in particular happening, or that something is making the rounds, and to give them all the details you can.
We had rapid growth: when I joined the campaign, we had about 100 people; within 6 months we had gone up to over 500 people in the headquarters alone. That doesn’t include field offices or volunteers, so that’s really rapid growth. We had a lot of volunteers coming through, and these are people who had been vetted in some way – they went through background checks and that type of thing, but they don’t go through the same training as staff – so there were a lot of people coming in and out of the office; we had media coming in all the time, people doing interviews constantly, etc.
We also had a very young corporate structure. The technology team was unique in that we all had experience in our fields; we needed people who could come into the campaign and immediately start building web applications or immediately start coding. A lot of the campaign was made up of younger people who had just graduated from college and didn’t particularly have years of experience, so that youth in an organization is something I hadn’t seen before. People were basically playing their entire lives out online, social media was really popular, and controlling those messages became a challenge.
We ended up running Snort as an IDS solution on every instance we deployed. It gets a bit heavy in terms of resource usage, but it was great to have that insight into what kind of activity each machine was seeing. ModSecurity is of course a great web application firewall for Apache. And again, Nmap is one of my favorite tools of all time.
And then, prior to deployment we’d do a code review or application assessment; typically both, where you have the code on one screen and you’re actually testing the application on the other. Burp is my all-time favorite web application tool. It costs a little bit of money, but I think it’s worth it. It’s a great interactive web application proxy, so if you haven’t checked it out, I really recommend it.
GitHub was our code repository of choice, and I think that really helps you out in doing code reviews, because you can see incremental changes and assign ownership to code, who wrote what; and if you know one guy is really bad at sanitizing SQL queries, you can focus on his stuff within the code, as opposed to somebody else who you know is doing things pretty well. So, the combination of all of that was, I think, pretty effective.
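To make the SQL-sanitization point concrete, here is a small, self-contained Python example of the pattern a reviewer would flag and its parameterized fix; the table and function names are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donors (email TEXT, amount REAL)")

def find_donor_unsafe(email):
    # The pattern a reviewer would flag: user input pasted into the SQL string,
    # so an email like "' OR '1'='1" changes the meaning of the query.
    return conn.execute(
        "SELECT * FROM donors WHERE email = '%s'" % email
    ).fetchall()

def find_donor_safe(email):
    # The fix: a parameterized query, letting the driver handle escaping.
    return conn.execute(
        "SELECT * FROM donors WHERE email = ?", (email,)
    ).fetchall()
```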
If possible, I like to embarrass people with my proofs of concept. My favorite thing we actually did within the campaign was around cross-site scripting. I developed a generic proof of concept where, if I was able to include some remote JavaScript, the JavaScript was designed to replace the background image with a big dancing otter, play some music in the background, and then pop up everybody’s cookies. I would send that to the developer; they would see it, and since I sat just across the room from them, I could watch them looking perplexed at their screen. I would then send it to everybody else on the team and have them take a look at it also, because nothing motivates somebody to fix a problem more than all of their friends and peers looking at the mistake they just made. I think it’s important not to humiliate them, but I think embarrassment is appropriate, so we were using that embarrassment to help everybody learn from a mistake and to have that kind of cooperative learning going on.
With respect to that, I think it’s really important that developers are proud of the code they write, so if they write really good code and you don’t find many problems with it, you should also tell them that, and you should use them as a reference for other people who might not be writing code as securely. So make them feel pride in what they’re doing.
If you are responsible for the security of applications on the Internet, you always have work to do. Even if something’s deployed and hasn’t had a problem in the past, you constantly need to keep up to date with what’s going on, researching new security issues and that kind of thing, so you can’t ever just stop.
Host: Ok, thank you! If anyone has a question, there are three mikes in the room. So, please stand up if you have a question.
Question: Hi. First of all I’d like to say that a few weeks back I read an article comparing your work with that of your opponent in terms of keeping the systems running and preventing collapse, and you guys did a great job, kudos for that. My question is: when you work with security, one of the things you always encounter is security or threat assessment that is lacking or not done well. Who is in charge of deciding what the likely or possible threats are, and who decides which of them should be addressed?
Ben Hagen: I think it was really advantageous for us to have the developer community that we had; I think they had a very realistic view of what kind of threats were being faced on the Internet, and they could kind of integrate that into their own development lifecycle. In terms of the organization as a whole, I would say that our Chief Technology Officer and campaign management were kind of in charge of assigning risks to the larger threats that we would face, so generally, if I found an issue, I’d bubble it up to that level, and they would make decisions based on what kind of realistic threat the organization faced, or the impact it could have in the near term.
Question: Were you compromised a lot? Like they said: “We understand, but it’s not going to happen.”
Ben Hagen: We warned people a lot. We weren’t aware of any actual compromises that we had of our data or anything like that, which is very fortunate, I think, in that regard. But I think we were constantly receiving threats from people that were very realistic, and I think the communication up the channel in terms of that was very important.
Question: When was the internal social network most effective during the campaign, maybe in some cities or some events? And what were the main issues in terms of security?
Ben Hagen: So, talking about the social network, we had an application called Dashboard; that was our internal tool for social networking, it was its own social network with its own accounts, its own login system, that kind of thing. It was very effective at organizing people at the micro level. So, people would join up onto the social network, and they would find groups that they had affinity for, for example, by interests, by location, by neighborhood, etc. It was very effective at cementing those relationships.
In terms of the social network, I think the threats we faced were mostly fraud-related or messaging-related. We had big issues with people making fake accounts, spamming the entire board and sending messages to lots of people. So, moderation became a very important part of what we did, not with the goal of censorship, but with the goal of keeping the riffraff out when they tried to cause some sort of issue. So, I think that kind of fraud was the bigger issue.
Host: We have a question from the Internet. The question is: how much did Obama himself make the job more difficult or easy? And did he request any special features?
Ben Hagen: Obama played more of an advisory role throughout the entire election. He was, obviously, the sitting President. I think most of the decisions were made by the campaign management, which is the campaign manager and senior advisors. He did play a role in terms of the messaging that we put out, general policy and that kind of thing, but he was hands off in terms of technology. He and the campaign management let us build what we thought would answer the problems that they were having. So, I think it was great to have him as a figurehead, but in terms of day-to-day business he didn’t play a huge role.
Question: First of all, thanks for your talk about a situation where you had to lock down a network a bit more. We get a lot of talks here about open networks where everyone can connect to everything, so I liked the perspective you presented here. One of the things you mentioned was that you did some internal training, like tricking people into clicking on attachments. I would think that you could get an angry mob against you quite soon. Can you tell us about that?
Ben Hagen: I think it’s important to make it more like a game, as opposed to something that creates a vindictive hatred of you or something like that. We played coy about it a lot when we’d send these kinds of things out. The training was that if you received an email that you were skeptical of, you should contact the help desk. You shouldn’t click on anything, you shouldn’t send it to anybody; you should contact the help desk immediately and get the problem resolved. I think you’re right, I think it’s a dangerous game to play with people. We kind of hid behind the fact that nobody was sure what was and wasn’t part of our training. So, people knew we were actually getting these threats, and we would send out information if we had a particularly wide campaign mounted against us, but in terms of what we did internally, we never really let people know that there was a wide-scale thing that we were doing.
Question: At one point you mentioned that during the campaign you used AWS extensively, which is understandable given the dynamics of the network and the infrastructure. You said something about 2000 nodes in the network. Were these servers or clients? What was the ratio between them?
Ben Hagen: Those were servers. For example, on Election Day we had over 2000 servers spun up in AWS across multiple availability zones, all serving the applications we were running. In terms of the ratio, I’m not exactly sure what that was. We had pretty aggressive scaling limits set on things, so they would scale up pretty readily. On Election Day we kind of threw caution, and money, to the wind and just said: “Scale everything up, get as much as we can,” so the ratio was probably still thousands of clients to a machine, but we had a ton of traffic going through.
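For a sense of what “scale everything up” can look like in practice, here is a hedged boto3 sketch of raising an Auto Scaling group’s limits; the group name and numbers are hypothetical, and boto3 itself postdates the 2012 campaign, so this is illustrative rather than what they actually ran:

```python
import boto3

# Crank an Auto Scaling group's ceiling and desired capacity ahead of a
# traffic spike. "web-frontend" and the sizes below are placeholders.
autoscaling = boto3.client("autoscaling")

autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="web-frontend",
    MinSize=50,
    MaxSize=2000,
    DesiredCapacity=500,
)
```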
Question: And how many clients were there?
Ben Hagen: We had a lot. I think the best statement regarding that is probably what our DevOps person said: 8.5 billion requests; on a given day we had several million requests to our main websites.
Host: One more question from the Internet: how many people were volunteers and how many people were full-time employees? And did that present an additional challenge?
Ben Hagen: Sure, I think it did present a challenge just in terms of the disparity in experience and in how much time they spent on the campaign. So, obviously, if you spend a lot of time there and you are actual staff, I think you’re motivated in different ways than a volunteer. They’re both great and incredibly helpful, but there’s certainly a challenge in reconciling that difference.
In terms of full-time staff at the headquarters, I believe we had between 500 and 700 at different parts of the campaign; that’s at the one location in Chicago. Across the country I think the number was more in the low thousands, something like 2000 or 3000 paid staff. In terms of volunteers, that’s a lot more; I think a lot of it depends on what you call a volunteer. On the bigger end, people say we had 2.2 million volunteers if you take into account people with online accounts or people who had shared information. Realistically, it’s probably more in the tens to hundreds of thousands.
Question: It would be interesting to know what kind of technologies you were using for your web applications, like Python, Ruby, .NET, or something else?
Question: How much of your system was on the cloud? And why did you decide to choose Amazon web service instead of having a private cloud like OpenStack? And you talked about open source: which networking device were you using if you had a physical setup?
Ben Hagen: So, in terms of our footprint on the Internet and our web applications, something like 99% of it was in the cloud – almost everything. We worked with Amazon because they have the most mature offering, with a lot of different services that you can use. So, if you need a database, they have a database; if you need queuing, they have queuing; if you need scaling, they have scaling. I think private clouds are really interesting, OpenStack is really interesting, but in terms of having the capacity to scale dramatically in a very short period of time, you really need to go with one of the bigger providers that has the infrastructure built out already. If you’re relying on a private cloud or something, you have some sort of limitation at some point: you might have a data center that can go this far, but it can’t go any further. Amazon lets you go basically infinitely large, which is something we needed on occasion. In terms of hardware in the campaign, for the security stuff that we were setting up it was just kind of a stock server with 4 CPUs and some RAIDed storage. Nothing special, basically; just kind of base 1U servers.
Question: I assume that you had a pretty decent SIP infrastructure. I was wondering if you saw any interesting exploits that involved SIP specifically, where people were trying to exploit the phones?
Ben Hagen: We saw scanning, we saw SIP-focused scanning; we were actually using a lot of Microsoft’s Lync back-end for SIP, which is an interesting conglomeration of different services. Aside from scanning, I don’t think we saw anything particularly interesting from it.
Question: Did you see any impersonation attacks? You know, “My alternative Barack Obama site, give me your credit card numbers;” and how quickly were you able to take them down if they were hosted some place dodgy?
Ben Hagen: I think we certainly did see that, and I think the Romney campaign saw the same thing, and I think you have very limited options other than communicating with law enforcement or with the hosting providers to have that kind of thing taken down. We kind of relied on the community, the security community but also Reddit or any number of places where that kind of thing bubbles up very quickly, and monitoring those kinds of sources for fake impersonation sites is a really great way to find them quickly. And I think the only options you have are to warn people or to approach law enforcement and have them take it down.
Question: What did your preparations for disaster recovery look like?
Ben Hagen: Amazon makes things interesting, and we had a number of potential disaster situations come up in the course of the campaign, like a hurricane coming down on one of the major Amazon data centers. Preparations for that generally involved replicating as much of the infrastructure as possible into a different availability zone, so getting as much as possible working in another data center, essentially. In terms of the campaign itself, we had kind of the typical corporate disaster recovery, with secure offsite storage of critical files, disaster recovery for individual laptops, backups of essential information, imaging of computers, that kind of thing. So, nothing terribly sophisticated on the corporate side; I think the interesting disaster recovery stuff is with the cloud services.
Question: Could you elaborate a little bit more on how you detect those fraud calls where someone is using your system in order to make Romney calls? How did you detect that and how did you prevent that from happening?
Ben Hagen: Basically, we were looking at the velocity of calls coming from individual users, and also at the responses they were recording for those calls. So, essentially, if you made a call, we asked that you record information like: did somebody answer the phone? Were they not at home? Did they answer, and were they a supporter, pro-Obama, pro-Romney? There were several possible answers for each call. Basically, looking at the typical spread of answers, and at the velocity with which people can realistically make calls and get legitimate answers, gives you a lot of information about what is and is not a valid user. So, detecting fraudulent activity becomes pretty easy as long as the other side isn’t using sophisticated impersonation tactics. We assumed they didn’t have insight into what typical behavior looked like, so we could rely on that.
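To make that concrete, here is a rough Python sketch of the two signals he mentions, call velocity and the spread of recorded answers; the outcome labels, thresholds and function name are assumptions for illustration, not the campaign’s actual algorithm:

```python
from collections import Counter
from datetime import timedelta

# Each call record is (user_id, timestamp: datetime, outcome string); the
# outcome labels and thresholds below are illustrative placeholders.

def flag_suspicious_users(calls, max_calls_per_hour=60, min_no_answer_rate=0.2):
    """Flag users whose call velocity or spread of answers looks implausible."""
    by_user = {}
    for user, ts, outcome in calls:
        by_user.setdefault(user, []).append((ts, outcome))

    flagged = set()
    for user, records in by_user.items():
        records.sort()
        times = [ts for ts, _ in records]
        outcomes = Counter(outcome for _, outcome in records)
        total = len(records)

        # Velocity check: more calls completed in any one-hour window than a
        # person could plausibly dial.
        for i, start in enumerate(times):
            in_window = sum(1 for t in times[i:] if t - start <= timedelta(hours=1))
            if in_window > max_calls_per_hour:
                flagged.add(user)
                break

        # Distribution check: legitimate calling yields plenty of non-answers;
        # a user who reports almost none is probably fabricating results.
        if total >= 20 and outcomes["no_answer"] / total < min_no_answer_rate:
            flagged.add(user)

    return flagged
```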
Host: There is another question from the Internet, and it’s a follow-up to the previous one: I’d like to know what the procedures were, how things were documented, and how you tested what impact the disasters might have.
Ben Hagen: In terms of deciding the impact of different scenarios, we had things we called “game days”, where we set aside a weekend, or a day of a weekend, and forced developers and DevOps people to contend with random issues with their applications. For example, on a staging network or a replicated environment, we would say: “Your RDS MySQL instance is down. Your application needs to be able to recover from that. Let’s try it out.” And we’d kill the database connection, see how it responded, and try to get as friendly a response out of that as possible, either failing over to another application, redirecting the user, or presenting some kind of static information. So, going through that entire stack of possible problems and figuring out the most graceful way to fail was a big part of those game days and a big goal in preparation for things like Election Day.
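As a tiny illustration of the “database disappears, fail gracefully” exercise, here is a hedged Python sketch; the file, table and fallback content are placeholders, not the campaign’s code:

```python
import sqlite3

# Static fallback content served when the database cannot be reached; in the
# game-day exercise the database is taken down on purpose.
FALLBACK_EVENTS = [{"title": "Events temporarily unavailable",
                    "detail": "Please check back shortly."}]

def fetch_events(db_path="events.db"):
    """Return upcoming events, degrading to static content if the DB is down."""
    try:
        conn = sqlite3.connect(db_path)
        rows = conn.execute("SELECT title, detail FROM events").fetchall()
        return [{"title": t, "detail": d} for t, d in rows]
    except sqlite3.Error:
        # Graceful failure: serve something useful instead of an error page.
        return FALLBACK_EVENTS
```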