At the end of his AIDE Conference presentation, Adrian Crenshaw describes a few more types of darknet attacks and enumerates a number of general takeaways.
Also, things can be done to affect timing (see right-hand image). This is where sybil attacks can help augment traffic correlation attacks. Let’s say someone is sitting there watching the timing; that can reveal information. They can also just sit there and control how fast the traffic goes through them; this can be similar to a tagging attack. The way I2P works is it signs the data, so if someone is modifying the data, that’s going to be an issue for them. I suppose if you slow down the packets and put a certain rhythm to them, you might be able to follow them along the line. There are also people who have done various attacks in Tor, changing the load on certain nodes to figure out who’s talking to whom, or to figure out who’s going through which nodes, and to reduce the anonymity set as well.
Mitigations for this (see left-hand image) would be things like more routers. The bigger the network is, the harder it is to find the needle in a much bigger haystack. Also, people talk about using Entry Guards: if the attacker is both the first hop and the last hop of your path in Tor, it’s very easy for them to figure out who you are and what data you’re sending. If the attacker is the exit point, they’re seeing unencrypted traffic, assuming you’re not using an encrypted protocol. If they’re also your first node, the entry point, they see the amount of traffic you’re sending, and it’s much easier to figure out that this person is the one who was sending the data coming out of this exit point.
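The entry/exit correlation described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how any real attack tool works: the traffic model (random one-second bursts), the delay and jitter values, and every function name here are made up for illustration. It counts packets per time window at the entry and at the exit and compares the two series with a Pearson correlation.

```python
import random

def bucket_counts(timestamps, window, duration):
    """Count packets per fixed-size time window."""
    counts = [0] * int(duration / window)
    for t in timestamps:
        idx = int(t / window)
        if 0 <= idx < len(counts):
            counts[idx] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def bursty_flow(rng, duration, burst_prob=0.3, burst_size=20):
    """Assumed traffic model: in each second, maybe send one burst of packets."""
    times = []
    for second in range(duration):
        if rng.random() < burst_prob:
            times.extend(second + rng.random() for _ in range(burst_size))
    return sorted(times)

def through_network(rng, timestamps, base_delay=0.05, jitter=0.02):
    """Assumed network effect: a small fixed delay plus random jitter per packet."""
    return [t + base_delay + rng.random() * jitter for t in timestamps]

def flow_similarity(entry_times, exit_times, window=1.0, duration=60):
    """Correlate per-window packet counts seen at the entry and at the exit."""
    return pearson(bucket_counts(entry_times, window, duration),
                   bucket_counts(exit_times, window, duration))

rng = random.Random(1)
alice_entry = bursty_flow(rng, 60)   # flow the attacker sees entering
bob_entry = bursty_flow(rng, 60)     # unrelated flow
exit_seen = through_network(rng, alice_entry)  # what comes out of the exit

print("alice vs exit:", round(flow_similarity(alice_entry, exit_seen), 3))
print("bob vs exit:  ", round(flow_similarity(bob_entry, exit_seen), 3))
```

In this toy model the matching flow correlates strongly with the exit traffic while the unrelated one does not, which is the intuition behind the mitigations discussed next.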
So, Tor does a couple of things to mitigate this. One is Entry Guards: the client chooses a certain set of nodes that it always contacts first. If it randomly chose peers to go through every single time, eventually the attacker would be both the exit point and the first node you hop into. By choosing a certain set that you always use as your entry points you avoid that, although you could possibly have really bad luck and choose a malevolent peer the very first time.
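The trade-off can be made concrete with a back-of-the-envelope calculation. This is a simplified model, not Tor's actual path selection: it assumes the attacker controls a fraction c of both entry and exit nodes, that nodes are picked uniformly and independently, and that the client keeps a single guard forever. The function names are made up for this sketch.

```python
def p_exposed_random_entry(c, n):
    """Chance that at least one of n circuits has the attacker at both ends
    when the entry node is picked at random for every circuit."""
    return 1 - (1 - c * c) ** n

def p_exposed_fixed_guard(c, n):
    """Chance of the same event with one fixed guard: the guard must be
    malicious (probability c) and at least one of n exits must be too."""
    return c * (1 - (1 - c) ** n)

c = 0.10  # assumed fraction of attacker-controlled nodes
for n in (1, 10, 100, 1000):
    print(n,
          round(p_exposed_random_entry(c, n), 4),
          round(p_exposed_fixed_guard(c, n), 4))
```

Under these assumptions the random-entry probability creeps toward 1 as the number of circuits grows, while the fixed-guard probability never exceeds c, which is the "really bad luck the very first time" case.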
One-way tunnels can help because they definitely seem to confuse things, at least when I’m trying to sniff traffic in I2P. Short-lived tunnels may help so that you’re not sending as much traffic through the same nodes. Basically, you use one set of nodes to route through for a little while, and then it resets and changes to a whole new set of nodes. Better peer profiling helps figure out who’s a bad actor, for example if you know that a peer only tends to send traffic in certain ways or at certain times. Signing of the data: I know I2P signs the data to make sure it hasn’t been modified; I’m pretty sure Tor does as well. Fixed speeds are another one; some networks have been proposed that use them to keep people from doing timing attacks, because if an observer sees that everyone is always sending data at the same speed, that could help upset some correlation.
Padding and chaffing: if you’re sending data out there and you’re worried about people doing analysis on the amount of traffic you’re sending, padding means it’s always the same size. Chaff would be kind of the opposite thing: someone sends out a bunch of data that’s padded, and a node along the way drops the unneeded data before it sends the rest on to the next node, so the sizes of packets going into this node and out of this node can’t be easily correlated. Non-trivial delays would help in some cases, and that goes back to some of the stuff I covered earlier.
Intersection and correlation attacks: this can be related to some of the earlier attacks as well (see right-hand image). This can be as simple as knowing who is up when a hidden service is available. Let’s say you’ve logged all the routers you know of inside I2P, and you log whenever this particular eepSite, the hidden web server, is up. If you notice that one particular I2P router is down at the exact same time that this eepSite is down, and it’s always like that, that might be an example of a correlation attack you could do.
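As a rough illustration of that uptime comparison, here is a minimal sketch. The probe logs and router names are invented, and the scoring (fraction of probes where the router and the eepSite were in the same up/down state) is just one simple choice, not a method taken from the talk.

```python
def agreement(service_up, router_up):
    """Fraction of probes where the router's state matched the service's."""
    matches = sum(s == r for s, r in zip(service_up, router_up))
    return matches / len(service_up)

def rank_candidates(service_up, router_logs):
    """Sort routers by how closely their uptime tracks the service's uptime."""
    scores = {name: agreement(service_up, log) for name, log in router_logs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical probe results, one entry per probe time (True = reachable).
eepsite_up = [True, True, False, True, False, True, True, False]
router_logs = {
    "router-a": [True, True, True, True, True, True, True, True],
    "router-b": [True, True, False, True, False, True, True, False],
    "router-c": [True, False, True, True, False, True, False, True],
}

for name, score in rank_candidates(eepsite_up, router_logs):
    print(name, score)
```

A router that is always up still agrees with the service whenever the service is up, so in practice you would want many probes that include several outages before reading much into a high score.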
Techniques can be used to reduce the anonymity set. I suppose someone could start knocking various machines off the Internet: you knock one off and the eepSite is still up, so ok, that must not be the one, and so on and so forth. Application flaws can also reduce the anonymity set. I’ve mentioned before that when I was doing some research on I2P, I was checking what particular web server software each machine was running. Well, if I log all this and I know you are running this particular version of Apache, I only have to check boxes that have that particular version of Apache. I’ve reduced the anonymity set: I’ve reduced the number of boxes I actually have to check to see whether or not it’s the same person. This also goes back to harvesting attacks, where you profile the different nodes in the network.
Here’s an example of a simple correlation attack (see left-hand image). Let’s say they’re trying to contact the Tor hidden server. They go and check whether or not it’s up, and then they check every other node in the network really quickly to see if they are up. Eventually, if one is down at the same time as the hidden server, that might give you an idea that it’s the same person. Within a really big network, this attack would be difficult to pull off, but it’s a really simple example of a correlation attack.
Let’s say you know the IP addresses of a bunch of the routers, and you want to find the router that actually hosts an eepSite (see right-hand image). You might be able to find out what software that eepSite is running. Then you can take all the IP addresses you’ve harvested out of the distributed hash table and check each one to see whether or not it’s running the exact same version of the software. Then, from each one that is running the same version of the software, you can request that site. For instance, and this is one of the attacks I was doing, let’s say there’s some site called somesite.i2p.
I might request it directly and go: “Ok, this is the certain software that you’re running and you’re returning that information to me.” Now I look at who has the same server software, connect to each of those machines, and in the Host header of my HTTP request I ask for that particular website. If you return that website to me, I know it’s you. You could actually de-anonymize some people in I2P that way.
General mitigations: of course, more nodes would help. The more nodes there are, the harder it is to pull off these attacks. I think I was only dealing with around 6000 nodes at a time in I2P, and it was doable on a home machine with a cable modem, but the more nodes you have, the more difficult that would be. If you’re hosting a server inside I2P and it’s an HTTP server, I2P strips out that server software header, so it’s not as easy to correlate. Giving out less data, of course, makes it a lot harder to pull off these kinds of attacks, because the attacker can’t reduce the anonymity set and has to check more nodes.
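The two steps of the eepSite attack described above, narrowing by server banner and then asking each remaining candidate for the site by Host header, can be sketched offline. This is not the tooling used in the research: the addresses, banners and page body are made up, and `fetch` is a hypothetical caller-supplied function standing in for a real HTTP request, so nothing here touches a network.

```python
def same_banner(harvested, target_banner):
    """Step 1: keep only harvested routers whose server banner matches the eepSite's."""
    return sorted(ip for ip, banner in harvested.items() if banner == target_banner)

def hosts_serving(candidates, fetch, hostname, expected_body):
    """Step 2: ask each candidate for `hostname`; a matching page gives the host away.
    `fetch(ip, hostname)` is a stand-in for an HTTP GET sent to `ip` with that Host header."""
    return [ip for ip in candidates if fetch(ip, hostname) == expected_body]

# Hypothetical harvested data: IP address -> Server banner seen on its public web port.
harvested = {
    "192.0.2.10": "Apache/2.2.14",
    "192.0.2.11": "nginx/1.0.4",
    "192.0.2.12": "Apache/2.2.14",
    "192.0.2.13": "Apache/2.2.9",
}

# Fake network for the demo: only one box answers for somesite.i2p.
fake_pages = {("192.0.2.12", "somesite.i2p"): "<h1>somesite</h1>"}

def fake_fetch(ip, hostname):
    return fake_pages.get((ip, hostname))

candidates = same_banner(harvested, "Apache/2.2.14")
print("reduced anonymity set:", candidates)
print("host:", hosts_serving(candidates, fake_fetch, "somesite.i2p", "<h1>somesite</h1>"))
```

The first step is what shrinks the anonymity set: with the banner stripped, every harvested address would have to be probed in the second step.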
Another thing is to make harvesting attacks harder to do. For instance, it’s easy to get a list of Tor routers because you can query the directory server and it gives you all that information; however, you can’t easily harvest bridge routers because they don’t put all that information in one place. That might be an example of making harvesting and scraping harder to do.
Ok, we’re almost done. If you want more information on various research into anonymity networks (see left-hand image), check out the archive that Freehaven has. Also, if you want more information on different threat models, I2P has a great page on that. I have a general darknets talk that I did earlier here at AIDE. And I also have a video and article on de-anonymizing eepSites inside of I2P. I’d like to say a few thanks: to the conference organizers for having me here; Tenacity for helping me get to Defcon; my buddies at Derbycon and the ISDPodcast; and also the Open Icon Library for helping me out with a lot of the “artwork” (see right-hand image).