Samy Kamkar continues his talk “How I Met Your Girlfriend” on hacking Facebook account, shedding some light on the use of LCG for reducing the amount of entropy
So let’s take a little closer look. If you’re familiar with Facebook, it has chat (see image). When you go online, when someone logs in, and you look in chat window, you actually see that person come online. How is it happening? It’s happening with AJAX1, your client is continuously checking with Facebook to see if someone else is logging in or someone is going offline, just to understand and get an updated status.
Well, if you use something like Live HTTP Headers (browser add-on) or a packet sniffer2 or something, you can see the HTTP requests going back and forth. And you can recreate those if you’d like. And what you can do is you can just send that request “Is there anyone new online?” every single second.
As you’re sending it every second, one of these times RSnake is gonna go on because he wants to check if anyone poked him. The cool thing about this is that if you see here in the red, the date in red is sent from the server, the server sends us their local time. Our local time doesn’t help us too much because we don’t really know the difference between our local time and the server’s local time. That local time helped create that cookie if you recall.
So, that 32 bits, if we are checking every second we can now reduce that 32 bits as soon as RSnake comes online, just by watching that every second, write a program to do that. We see him come online, we take that date, convert it to epoch – we just reduced 32 bits. We’ve now reduced the 160 bits by 44 bits down to 116 bits. That’s awesome, still a lot.
Let’s go further. So he comes online, we can send him a message. Why not send him to my blog namb.la? So you send him there, and then what you do is you just track the IP address. Don’t worry, there is no XSS3, there is nothing on there, if he is running Nosript or something that’s protecting him, there is nothing malicious on my website. So he goes there, nothing happens to his browser, he sees a really cool blog post about how I did a Defcon talk and everyone loved it.
And what we do is we track the Apache logs and we see his IP address. There is another 32 bits of that cookie. So we now downed it to 84 bits from 160, basically half, well it’s not really half, okay, bits – it’s so confusing.
So what’s left here that we don’t know is 20 bits of the microseconds and we’re not gonna guess that. There is no way we’re accurately gonna guess the microseconds that someone logged in on a remote system. You might be able and try really hard if you got system really close and time things really accurately, but it’s not worth it.
So the only other thing left here is this random ‘lcg_value’, 64 bits. What is this? An LCG4 is a Linear congruential generator, it’s a pseudo-random number generator. It’s been studied for years, since like 25 years ago or something, it’s older than I am. So it’s really well studied and really well understood. You can actually look up information on how to reverse that. LCG used here is actually 2 LCGs that are combined, so it’s a little bit harder. I didn’t understand that too well, but I looked over the code anyway.Well, as soon as the LCG – the random number generator – is called, it’s seeded. The seed basically provides the actual random data that provides every single random number from here or now. Now the seed is critical to the randomness of the PRNG5.
So let’s take a look at the seed function here – this ‘lcg_seed’ at the bottom side of the screen. What you’ll see is… there are 2 parts of the seed, and it’s 64 bits. It’s 64 bits of entropy, and every random number is also 64 bits. It’s split into 2, called s1 and s2, each 32 bits long.
Now s1, as you can see, is the thing called ‘tv_sec ^ ( ~tv_tv_usec), what that is… that’s epoch, as soon as it’s seeded, exerted with the ones complement of the microseconds that PRNG was seeded. s2 is the process ID.So let’s just take a look at s1 a little bit more. What you are seeing right now on the image is 32 bits of entropy. The interesting thing about this is the seconds that the PHP was seeded, was probably when the web server started. We don’t necessarily know when that happened but we can, we can potentially make an estimate. We can also send thousands and thousands of requests to web server to get it to reset, and to get, to figure out when it started.
One of the issues is that we wanna know which request caused that reset, but we can make an assumption that it happened the last hour, if we send enough requests. Now, what they do to make this harder to guess is they exert with the microseconds. There is no way we’re gonna know the microsecond that the web server started, but they’re exerting the most variable data of the epoch with the most variable data – microsecods. The fixed data like what year, what month, what week it started – remains the same. Basically it means the fixed data remains fixed.
So we end up with 12 bits that we can guess if we know within a 12 day period of when PHP started. And again, if we don’t, we can send enough requests to get that. The other 20 bits is just microseconds exert with the other variable data of epoch, we don’t know that, whatever, that’s 20 bits. We just reduced 12 bits of entropy in the random number generator.
So let’s look at s2 – process ID, get process ID. That’s 32 bits of entropy. Well, process IDs on Linux are only 15 bits long. So immediately you reduce 17 bits. And if you can execute PHP through any function, if you can acquire, if you can execute a program like ps, if you can hit an Apache servering for page or something like that, you get the entire 32 bits. We’ve now reduced 64 bits down to 20 bits in the PRNG.
We are now at a total of 40 bits of entropy from a 160 bit cookie, for every cookie in PHP, that’s awesome. But wait, there is more. We can take, normally we think like 20 bit and 20 bits – that’s 40 bits. Well, we can actually calculate the PRNG, the LCG value, the 20 bits that we didn’t know – separately. Separately means we calculate that 20 bits, and then the other 20 bits – that’s only 21 bits.
We can calculate with the time-memory tradeoff6 and code I’ve created. We can calculate that 20 bits in a matter of seconds. That reduces the other 20 bits, and now it’s exactly 20 bits. The 20 bits of entropy from the microseconds that the user authenticated, the cookie. On average, we will be able to log in as him with 500,000 requests, which we can easily do in a day. So I’ve done it. I’ve become RSnake!
So what can we do at this point? Well, first let’s understand how do we fix this. Make sure you are running a new version of PHP (PHP 5.3.2 or later). I sent this over to PHP, and they quickly released the patch, they added some more entropy. Or create your own session values. Use your own randomness. One of the great things about PHP is that it’s very fast, it’s meant to be fast. OS is basically cross-compatible. So they don’t do too much that’s very system level, like they’re not gonna access /dev/random because that’s all on Windows for example. So create your own session values or seed your own random number generator.
You don’t need to understand crypto, just use a strong seed that your system comes with. If you’re running on Linux there is BSD or something like that, you know, use /dev/random for your seed. Don’t use the process ID. The attack is difficult to execute. It’s much easier on social networks, where I spend most of my time unfortunately.
One thing to note, Facebook is actually not vulnerable, this is not an attack on Facebook. If you’re familiar with Facebook, they created their own version of PHP called HipHop. It’s sort of compiled with C++, supposed to be much faster. I love Facebook. If you could plant some crops for me in my Farmville, I would appreciate that.
So at this point, what do we do? I am logged in as RSnake, how am I gonna meet this girl, you know? She is happily ‘boyfriended’, I don’t know what the word is. So using his cookie, I can now message her as him. So what do I say? Oh, here is the thing, why don’t I send her to a malicious URL: namb.la? We’re gonna attack her network now.
We’re gonna learn a little bit about a network and a NAT here. So a NAT – what it is it’s basically a system that allows you to run multiple systems behind one public IP, in a nutshell. All of your computers behind the NAT will typically run in private IP space. Well, typically your cable modem provides only one IP, so use router which contains NAT software. And that will allow all your network devices to run behind the NAT.
It also is some kind of a firewall. It prevents people from accessing services and ports that you have running on your computer, whether you know it or not. So when you go behind the NAT and you’re running, let’s say, Apache on port 80, no one can connect to you, except internally on your network, unless you go to the NAT and you enable port forwarding or DMZ7 or something else.
1 – AJAX is a group of interrelated web development techniques used on the client-side to create asynchronous web applications.
2 – Packet sniffer is a computer program or a piece of computer hardware that can intercept and log traffic passing over a digital network or part of a network.
3 – XSS (Cross-site scripting) is a type of computer security vulnerability typically found in Web applications that enables attackers to inject client-side script into Web pages viewed by other users.
4 – LCG (Linear congruential generator) represents one of the oldest and best-known pseudorandom number generator algorithms defined by the recurrence relation.
5 – PRNG (Pseud-random number generator) is an algorithm for generating a sequence of numbers that approximates the properties of random numbers.
6 – Time–memory tradeoff is a situation where the memory use can be reduced at the cost of slower program execution (and, conversely, the computation time can be reduced at the cost of increased memory use).
7 – DMZ (sometimes referred to as a perimeter network) is a physical or logical subnetwork that contains and exposes an organization’s external services to a larger untrusted network, usually the Internet.