Andy Ellis now makes emphasis on risk reduction in a long-term perspective, concurrently highlighting some scare techniques security vendors tend to leverage.
Now let’s look at some ways that people act, and I’m going to include a couple of my anecdotes here. First one isn’t me. So, I went and took 3 different terms that I thought would be interesting, and grabbed news headlines. If I wanted more money – and I’m in a habit of begging for money – here’s what I might do.
I might say: “Oh my goodness, look at this, we had a data breach in Utah. The CTO is responsible for that. Mr. CTO, do you want to be responsible for that? I didn’t think so, so we need full disk encryption right now, and a data loss prevention tool. I’m going to go get some money.” People do tend to discover this.Or I might say: “Hey, look at that, you could suffer a denial-of-service attack. It doesn’t even matter if you’re a for-profit business, we need to go buy DDoS protection right away.” I’m actually in the business of selling that, full disclosure.
Or we might say: “Look, web application vulnerabilities, this is the hot new thing, so we need to get a WAF! And maybe we should have either a dynamic application testing tool or a static one, and look at our coding practices, and see what we can do.” Anybody ever seen any of these dynamics? If you’ve ever actually been marketed by a security vendor, this is what they do. They will walk in and they will throw headlines in front of you and say: “Now buy our technology, because you don’t want this to happen to you.” The problem is, this comes back to that model of creating perceived risk and then not actually reducing it. People stop believing that this might happen.So, instead of begging for money, I would argue that you should waste your crises. This is the real thing; this is a real breach that happened to Akamai 11 years ago. Fluffy Bunny, if everyone remembers Fluffy Bunny, had this really cool SSH Trojan, where they would break into a system, they would get root access, they would replace SSH, and then when you logged into somewhere else, it would record your password, your RSA identity, and your passphrase. And then they would come back again, they would collect those and they would go log into another system using that. Very neat technology. Thank goodness, they had a bug: it didn’t actually overwrite your private key. So, if you used key-based authentication, you were safe, simply due to a coding error. I’ll always take a coding error on the part of my adversaries.
So, they broke into one of our systems, we actually noticed almost immediately, simply due to blind luck. You notice the second to last line here has a password at the end of it, it’s about 30 characters long (see image above). When Fluffy Bunny went to use this to log into MIT, the person’s login scripts automatically started an instant messaging service that sent out a broadcast ping saying they were online. The person was standing in a dorm room of one of their friends when they got the notification that said: “I’m online.” He said: “That’s very odd, because I’m not.” He said: “I only ever log in from 2 places: my laptop, which is in my pocket, and from Akamai.” Clearly, Akamai suffered a breach. He gave us a call, we went, we looked at this, and we dealt with it.
Now, what did it mean to have dealt with this? We could have said: “Oh my goodness, this is a crisis, this requires us to completely change our login infrastructure.” And we actually had people arguing for that, saying that we should take this opportunity to rebuild everything from the ground up. And we looked at it and said: “Well, rebuild everything from the ground up isn’t something we can do overnight. But overnight we can do some pretty simple things. That login server, why is it that people actually get a shell on it, why isn’t it just a pass-through gateway?”We turned our SSH gateways into VPNs instead. We would still use SSH but they’d behave like the VPN. Now we didn’t have that risk. We changed some of our authentication models. So, we took that and we said: “Here is this big risk; let’s act to bring it back down, let’s make the business more comfortable, and instead let’s take a long-term approach and look at our risks to effect long-term change, because that system wasn’t our biggest authentication risk. This was our biggest authentication risk.”
When I started at Akamai, every single developer had a copy of a root key for logging into every single machine. That’s a way bigger risk than what Fluffy Bunny presented to us. Fluffy Bunny got access to a corporate network machine and stole some source code. I had developers who had left the company that had root access to every machine – this was far more dangerous.
So, we looked at this and we said: “There are lots of possibilities here, and we know what our end goal is, but nobody will let us get there day 1, it’s too hard. But what we can do easily is we can convince people that we need to rotate that key. People left the company with it, we’re going to give out a new one. And, in fact, even better than giving out a new one, we will set up a gateway machine that people can log into, the new one will be preloaded on an agent for you, and if you need to log in somewhere, you can do so. We’re not going to force anybody to change their practices.”
Several developers said to us: “We’d like you to give us the new key, just in case that system doesn’t work.” We said: “Great, the old key is root at Akamai, we will give you a copy of root 2 at Akamai.” It made them very happy, so we did roll this forward and we loaded into those SSH agent root 3 at Akamai. And everybody was able to only get out to the network from those machines; some people were left with their useless key. We told them we’d had a problem in the rotation and moved forward to root 3 instead.
But that was just a step. So, we brought up: here’s a risk, developers have left the business, now we’ve dealt with that risk. Then we said: “Oh wait, we still have that problem. Now we have people who are logging in through the machine, we don’t really know who’s logging in where, and people can log in everywhere. That’s a problem. Now let’s deal with that as a separate initiative.”
So, we built our own SSH application proxy: users now log into our deployed network using their own identities, going through a proxy, which has on it a role-based access control list. It says: “Andy Ellis, you’d like to log into this machine with a read-only account? Well, you have a grant that will let you do that.” And it will pass it through.
But more importantly, it will say things like: “The developers of our mapping software can log into up to 10 mapping machines per day.” We don’t have to specify which 10. As they log in, they get access; when they hit 10, they’re done, letting us enforce least privilege in the same fashion without having operators trying to grant access continuously throughout the day.
And more importantly, our developers view this as an improvement over having to log in through a gateway machine. So, we did a 2-step change that not only dealt with the first vulnerability, dealt with the second one and let me cut out over 99.9% of all access to our deployed network at the time. That was massive.
Now we can actually say things like: “Only a small handful of people can log into even remotely large numbers of machines.” If you just saw one of the previous slides, you saw Fluffy Bunny said: “What if somebody executed a ping flood from 6500 machines?” I don’t have 6500 machines anymore; I have a 100,000 more than that. That’s a very large weapon of mass destruction; I really don’t want even my own employees able to use that weapon.
So, these are ways to think about effecting long-term change: know where you want to end up, and figure out what are the steps to get there that your business will tolerate, because it will not tolerate too much risk reduction either.