A Forensic Analysis of Android Network Traffic 3: Data harvesting by Zynga and Words With Friends

Read previous: A Forensic Analysis of Android Network Traffic 2: Research methodology

Eric Fulton’s focus here is on the types of information the Zynga and Words With Friends apps collect about their users, based on data from packet capture files.

Analyzing packets with Wireshark

So let’s start analyzing. With each packet capture, I first poked around within Wireshark; I analyzed some of the conversations and some of the IPs being addressed; I ran strings and ran ‘grep’1, which is pretty easy Linux stuff; and then I did some DNS play and some Argus2 flows.

So first – Wireshark (see image). If you guys haven’t done any network analysis, Wireshark is kind of the ‘de facto’ GUI tool. It’s really nice just to kind of poke around and scroll. And you can just visually look at and see: oh, this is HTTP traffic, DNS traffic etc. It’s a good starting point, kind of gives you a feel for the lay of the land.

Further analysis workflow with Tshark

But command line tools are more powerful, so I moved to Tshark (see image). I wanted to basically read the packet captures, look around, and see which conversations were happening. So I ran Tshark to see who these applications are talking to, what services they are using, and who they are sharing data with. And then I Whois’ed like a mofo.
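The exact Tshark commands are only shown in the slide image, so the following is a stand-in rather than the speaker's workflow: a minimal Python sketch of the same "who is talking to whom" question. It assumes a classic pcap file (not pcapng) captured on an Ethernet link and only counts IPv4 traffic; the function name `ip_conversations` is made up for illustration.

```python
import socket
import struct
from collections import Counter


def ip_conversations(pcap: bytes) -> Counter:
    """Sum original packet bytes per unordered IPv4 address pair."""
    magic = pcap[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic pcap file")

    # The link type is the last field of the 24-byte global header.
    linktype = struct.unpack(endian + "I", pcap[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    totals = Counter()
    offset = 24
    while offset + 16 <= len(pcap):
        # Per-packet header: ts_sec, ts_usec, incl_len, orig_len.
        incl_len, orig_len = struct.unpack(
            endian + "II", pcap[offset + 8:offset + 16]
        )
        frame = pcap[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len

        # Skip anything that is not IPv4 over Ethernet (EtherType 0x0800).
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue

        # IPv4 source and destination sit 12 and 16 bytes into the IP header.
        src = socket.inet_ntoa(frame[26:30])
        dst = socket.inet_ntoa(frame[30:34])
        totals[tuple(sorted((src, dst)))] += orig_len
    return totals
```

Calling `ip_conversations(open("capture.pcap", "rb").read()).most_common(10)` (with `capture.pcap` as a placeholder file name) would list the ten busiest address pairs, which are then the natural candidates for a Whois lookup.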

Servers Zynga communicates with

So if we take one specific example, we can look at Zynga. How many people here know what Zynga is? Oh, nice, I should have assumed this is Defcon, you guys are smart. And for those who have not raised their hands, Zynga is kind of the new mogul, if you will, for Android games. Zynga makes a large number of those idle-time games. You might want those games when, as I stated earlier, you sit down, you haven’t got anything to do, and you want a game that you can play for 5 minutes; or you are in a conversation that is really boring, and you can play for 5 minutes. And they are widely popular because people have a lot of idle time.

And so I took a look at Zynga, and I was like: “Who is Zynga talking to?” (see list above). Well, if you look at the image – and I am not gonna read each one out – there are a lot. There’s TapJoyAds, Mydas, Facebook, Macromedia, Adobe.

And when you look at this, there are a couple on there that make you curious. I mean, this was for Zynga Poker. So you are playing poker on your phone, and you don’t really expect it to call out to Mydas.mobi. What does this company do? What does Mkhoj do? What does TapJoyAds do? What information is being sent to these third parties that you have absolutely no idea about?

And this is where we get to the privacy element. When you downloaded that application for poker, did you really understand that you were gonna be sending your statistics, your Android version, your location potentially, to Zynga Poker? And why do they need to know it?

So this is kind of a really big question: what is being sent from your phone without you knowing? I brought this question up because I was thinking about what applications I have.

Using strings to analyze packet capture file

Well, one of the easiest, quick and dirty ways to look at a packet capture file and see where the traffic goes is strings. Strings basically just outputs the text strings that are inside a packet capture file, or any file for that matter.
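To make clear what that tool does, here is a rough Python approximation of it (a sketch, not the real utility): it pulls every run of printable ASCII characters of a minimum length out of a blob of bytes. The function name and the default of four characters are assumptions chosen for the example, and the character set is simplified compared with the actual command.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII that are at least min_len long."""
    # 0x20 to 0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    sample = b"\x00\x01GET /index.html HTTP/1.1\r\nHost: example.com\x00\x02ab\x03"
    for text in extract_strings(sample):
        print(text)
```

Because unencrypted HTTP travels as plain text, request lines and headers survive this filter intact, while binary framing around them is dropped.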

Basically what I did was I looked for interesting things. And you see on here, one of the first things I did was search for HTTP, trying to see what websites were being contacted. And then I had a couple of key phrases. And I did this for a couple of reasons. One – I don’t wanna have to go through every packet capture file, trying to figure out what password was going through.
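One way to picture the "which websites are being contacted" step is to count the `Host:` headers that appear in the raw capture bytes. This is a hedged sketch, not the command from the slide: it only sees cleartext HTTP, and the function name and the `example.com` hosts in the usage are placeholders.

```python
import re
from collections import Counter

# Matches the hostname part of an HTTP Host header, ignoring any port.
HOST_RE = re.compile(rb"Host:[ \t]*([A-Za-z0-9.-]+)", re.IGNORECASE)


def contacted_hosts(data: bytes) -> Counter:
    """Count how often each hostname appears in cleartext Host headers."""
    return Counter(
        match.decode("ascii").lower() for match in HOST_RE.findall(data)
    )


if __name__ == "__main__":
    raw = (
        b"GET / HTTP/1.1\r\nHost: ads.example.com\r\n\r\n\x00\x17"
        b"GET /a HTTP/1.1\r\nhost: ads.example.com\r\n\r\n"
        b"GET /b HTTP/1.1\r\nHost: api.example.net:8080\r\n\r\n"
    )
    for host, count in contacted_hosts(raw).most_common():
        print(count, host)
```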

Apps exposing password and email

I made some basic things to look for. I made ‘w00tdefcon’ my password. I made my username droid.net.foren@gmail.com. And for those of you thinking: “Oh, he left the password, I’m gonna go log in” – yes, I did, I don’t care. I am not using it anymore.

Basically what I did was plant a kind of cookie within the packet capture files, so that I could grep for w00tdefcon and instantly see where my password showed up. I could see that right away, rather than trying to figure out what the password field is called, or whether it is in the GET parameter, the POST parameter, whatever. I just got w00tdefcon going over the wire. I also did it for my email address.
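The marker trick can be sketched in a few lines of Python. This is an illustration under assumptions, not the speaker's grep invocation: it searches raw capture bytes for each known marker, also tries the URL-encoded form (an email address in a form post usually has `@` sent as `%40`), and returns some surrounding bytes for context. It only finds values sent in cleartext that are not split across packets; the function name `find_canaries` is invented for the example.

```python
from urllib.parse import quote


def find_canaries(data: bytes, canaries, context: int = 40):
    """Return (canary, offset, surrounding bytes) for every occurrence."""
    hits = []
    for canary in canaries:
        # Search for the literal value and its percent-encoded form.
        variants = {canary.encode(), quote(canary, safe="").encode()}
        for needle in variants:
            start = 0
            while True:
                idx = data.find(needle, start)
                if idx == -1:
                    break
                lo = max(0, idx - context)
                hi = idx + len(needle) + context
                hits.append((canary, idx, data[lo:hi]))
                start = idx + 1
    # Order hits by their position in the capture.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    blob = (
        b"POST /login HTTP/1.1\r\n\r\n"
        b"email=droid.net.foren%40gmail.com&pass=w00tdefcon"
    )
    for canary, offset, snippet in find_canaries(
        blob, ["w00tdefcon", "droid.net.foren@gmail.com"]
    ):
        print(offset, canary, snippet)
```

The surrounding bytes are the useful part: they show the parameter name and the request the marker travelled in, without having to guess either in advance.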

Well, when you look at it, w00tdefcon is definitely going over Facebook, obviously. I mean you have to log into your Facebook to actually get the alerts that you wanna see about your best friend and the update.

But what we did realize was that Facebook, Words With Friends and Zynga Poker also know my email – again, that can be assumed. But beyond that, any attacker can capture this. And this is why I am so focused on the privacy element. This is where privacy kind of intersects with what I am doing.

So we get to where, as an attacker or as a ‘man in the middle’, I now potentially know your Facebook password, your Facebook URL domain name, etc. – all because you are playing poker. I now know potentially where you are located and potentially what you are doing.

User data collected by 'Words With Friends'

And when we dealt with Words With Friends, we could see very interesting things (see image). This is an output that I got from running strings on Words With Friends. And again, this is all very simple stuff. I mean, I am not doing extremely advanced packet analysis. This is quite simple. If you have a Linux VM, or Linux box, you can do all of this.

So I ran strings on the capture file that I had, and I found that Words With Friends is sending a couple of interesting things. One – they are sending the network that I am on. So they know whether I am using AT&T, T-Mobile, Verizon, etc. So now they know that my phone is Verizon. And they know that I am a Millennial, which I found was kind of weird. I think they are guessing. But they also know what my build version is for my Android, they know what apps I am using. Some of these are hypotheses, and some of these are facts. And I am guessing they know where I am located based on my distance to the ad server. They know what screen resolution I am using, what language I am using, etc.
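Fields like these typically travel as key/value pairs in a request URL, which makes them easy to break apart once strings has surfaced the line. The sketch below shows the idea with Python's standard URL parser; the host, path and parameter names in the usage (`carrier`, `os_ver`, `res`, `lang`, `loc`) are hypothetical and are not the actual fields from the capture.

```python
from urllib.parse import parse_qsl, urlsplit


def request_fields(url: str) -> dict:
    """Split a request URL's query string into a field -> value mapping."""
    # keep_blank_values=True keeps fields the app sent with no value.
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


if __name__ == "__main__":
    # Hypothetical ad request; every name here is made up for illustration.
    url = (
        "http://ads.example.com/getad"
        "?carrier=Verizon&os_ver=2.3.4&res=480x800&lang=en&loc="
    )
    for name, value in request_fields(url).items():
        print(f"{name:8} {value!r}")
```

Keeping blank values matters for this kind of analysis: an empty field still shows that the app tries to send that item, even when the phone had nothing to report.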

In my testing, some of this didn’t quite show up because I hadn’t fully set up the phone, so there was nothing in there for them to send. But it definitely lets you know what they are sharing.

Other data 'Words With Friends' knows

And so I continued on. Okay, they have got my email, they also have my device ID. They also know that my last word was ‘about’, and I got 18 points for it. But that’s pretty obvious. But in any case, they know the time I was accessing it, they know my email, they know my device ID etc. (see image).

What’s important about this? Well, I can only assume that my device ID is unique to my device. I can also assume, or I feel safe assuming, that Zynga has a number of different applications, and in every application, they know that my device is using it. They know that I am using their game 1, 2, 3, and 4.

But then we tie this to a larger ecosystem issue: advertising. It is one of the largest eroders of privacy, because advertisers want to know as much as possible. They want to know exactly who you are so they can market directly to you.

So we take it from Zynga, and we move to a higher level of the advertising agencies that Zynga cooperates with. Now that they have my device ID from Zynga, they can also tie it to their other partners’ sites if they can pull my device ID. And then they can tie together all these separate pieces of information that I never really thought someone else would be collecting, and they start to put it all together.
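The linking step is simple enough to show in a few lines. The following Python sketch is purely illustrative: the hosts, field names and device ID are invented, and nothing here is taken from the capture. It only demonstrates how records from unrelated services collapse into one profile once they share the same device ID.

```python
from collections import defaultdict


def profile_by_device(records):
    """Merge (host, fields) records that share the same device_id."""
    profiles = defaultdict(lambda: {"hosts": set(), "fields": {}})
    for host, fields in records:
        device_id = fields.get("device_id")
        if device_id is None:
            continue  # nothing to join on
        profile = profiles[device_id]
        profile["hosts"].add(host)
        for key, value in fields.items():
            if key != "device_id":
                # Keep the first value seen for each field.
                profile["fields"].setdefault(key, value)
    return dict(profiles)


if __name__ == "__main__":
    # Hypothetical observations; all names and values are made up.
    observed = [
        ("game.example.com", {"device_id": "abc123", "email": "user@example.com"}),
        ("ads.example.net", {"device_id": "abc123", "carrier": "Verizon"}),
        ("ads.example.org", {"device_id": "abc123", "lang": "en"}),
    ]
    for device_id, profile in profile_by_device(observed).items():
        print(device_id, sorted(profile["hosts"]), profile["fields"])
```

No single record in the example says much, but the merged profile carries the email, carrier and language together, which is the aggregation concern described above.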

Read next: A Forensic Analysis of Android Network Traffic 4: Geolocation by Google
 

1 grep is a command-line utility for searching plain-text data sets for lines matching a regular expression.

2 Argus (Audit Record Generation and Utilization System) is a fixed-model real-time flow monitor designed to track and report on the status and performance of all network transactions seen in a data network traffic stream. Argus provides a common data format for reporting flow metrics such as connectivity, capacity, demand, loss, delay, and jitter on a per transaction basis.
