David Maynor and Paul Judge introduce the concepts of Friends-Followers Delta and the Tweet Number to explain the essence of Twitter network misuse.
Paul Judge: We wanted to better understand the behaviors and properties of Twitter accounts that get suspended. One of the things that we looked at was the Friends-Followers Delta (see graph). The Friends-Followers Delta reflects the difference between the number of people you follow and the number of people following you. The thing that we noticed is that the attackers are using pretty aggressive recruitment activity to get a higher number of followers, so their delta is higher. What you see here in the green space is the delta for legitimate accounts: on either side, people that have more followers and people that have more friends. But for the suspended accounts, you see a much higher delta because either they’ve successfully created a higher number of followers or they are still in the process of following people, so those people can follow them back, and so they have a higher number of friends. That’s a pretty interesting attribute to use, to get some separation.
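As a rough illustration of the metric Paul describes, here is a minimal Python sketch. The function names are hypothetical, and the sign convention (followers minus friends) is an assumption taken from David's later remark that a positive delta means more people follow the account than it follows. The follower and friend counts are made-up values, not data from the talk.

```python
def friends_followers_delta(followers: int, friends: int) -> int:
    """Followers minus friends (assumed sign convention).

    Positive: more people follow the account than it follows.
    Negative: the account follows more people than follow it back.
    """
    return followers - friends


def delta_magnitude(followers: int, friends: int) -> int:
    """Size of the imbalance regardless of direction."""
    return abs(friends_followers_delta(followers, friends))


# Hypothetical account that follows 825 people and has 500 followers.
print(friends_followers_delta(followers=500, friends=825))
print(delta_magnitude(followers=500, friends=825))
```

The magnitude helper reflects the point that suspended accounts stand out on either side of zero, so the size of the gap matters as well as its sign.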
The other thing that we looked at to get some separation is this number that we call the Tweet Number. And the Tweet Number is pretty simple math: how many tweets you have sent, divided by how many days you have been on the network. So it’s basically on average how many tweets a day you’ve sent since you joined Twitter. For example, my Tweet Number happens to be 1.8. I think Dave’s is 3.2. So it’s interesting, I know some friends, a couple of guys in the room, whose Tweet Number is 40. You know, 40 is like tweeting every 15 minutes in a work day. It’s like, wow, it’s pretty high, it gets kind of annoying. But then there are some other accounts, if you look on this (see diagram), that are actually tweeting 100 times a day, but it’s only 0.19% of the population. Seems like okay, it’s only a small number of people, 0.19% of the population. But what happens, if you think of that 0.19% of 50 million users, we’re talking about a couple of hundred thousand users. Now, when you’re talking about a couple of hundred thousand users tweeting at least 100 times a day, you’re talking about 19 million tweets out of 50 million, you’re talking about 38% of all the traffic on Twitter. So over a third of the traffic on Twitter is being generated by this 0.19% of the population. We thought this was a pretty interesting attribute.
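The Tweet Number can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the dates and tweet count are invented, and the 10-hour work day used to reproduce the "every 15 minutes" remark is a guess rather than a figure from the talk.

```python
from datetime import date


def tweet_number(total_tweets: int, joined: date, today: date) -> float:
    """Average tweets per day since the account joined."""
    # Guard against division by zero for accounts created today.
    days_on_network = max((today - joined).days, 1)
    return total_tweets / days_on_network


def minutes_between_tweets(tweets_per_day: float, active_hours: float = 10.0) -> float:
    """Average gap between tweets if they are spread over an active day.

    active_hours=10 is an assumed work-day length, not a value from the talk.
    """
    return active_hours * 60 / tweets_per_day


# Hypothetical account: 180 tweets over 100 days on the network.
print(tweet_number(180, joined=date(2010, 1, 1), today=date(2010, 4, 11)))

# A Tweet Number of 40 spread over an assumed 10-hour day.
print(minutes_between_tweets(40))
```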
So, what we did from there is we looked into how we can begin to build some level of reputation by coupling these features together, coupling the Friends-Followers Delta with the Tweet Number. And as we did it, we got some interesting separation in the graph, interesting clusters of user types. David will step us through some of those.
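One very simple way to picture coupling the two features is a rule that looks at both at once. The sketch below is not the speakers' reputation system, which the talk does not spell out. The class name, the function names, and the cutoff of 100 tweets a day are assumptions for illustration only, and the account figures are made up.

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    followers: int
    friends: int          # accounts this user follows
    total_tweets: int
    days_on_network: int


# Hypothetical cutoff: the talk mentions 100 tweets a day as unusually heavy,
# but does not say it is used as a threshold.
HIGH_TWEET_NUMBER = 100.0


def features(account: AccountStats) -> tuple[int, float]:
    """Return (Friends-Followers Delta, Tweet Number) for an account."""
    delta = account.followers - account.friends  # assumed sign convention
    tweets_per_day = account.total_tweets / max(account.days_on_network, 1)
    return delta, tweets_per_day


def looks_suspicious(account: AccountStats) -> bool:
    """Flag accounts that tweet heavily while few people follow them back."""
    delta, tweets_per_day = features(account)
    return delta < 0 and tweets_per_day >= HIGH_TWEET_NUMBER


# Made-up account: follows more people than follow it, and tweets constantly.
heavy_sender = AccountStats(followers=100, friends=425,
                            total_tweets=10_890, days_on_network=100)
print(features(heavy_sender), looks_suspicious(heavy_sender))
```

A fixed rule like this is only a starting point. The clusters described in the talk suggest the two features are more useful when plotted against each other than when cut at a single threshold.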
David Maynor: So this is the Friends-Followers Delta on the positive side, which means there are generally a lot more people following them than they are following, and the usual suspects there are accounts like foxnews (see table). Number 4 and number 5 are that Bieber kid. And xMileySupporter also makes the top list there.
When you go from a Friends-Followers Delta of, you know, like 119,000 down to the 4,000 or 5,000 range (see table), you get people like iSkeetThenTweet, which I don’t know what that means. LiveBloggerJobs is more like localized recruitment kind of things. I don’t know what iSkeetThenTweet is recruiting, but it looks like the rest of the stuff here is financial news and that kind of stuff.
And then the lower you go and the closer to zero, you’re starting to see some scammers (see table), like Moneywholesale. You know, nobody uses Moneywholesale, none that I’m aware of, but if you do, I’d like to know about this. And LA_Restaurants: there is really no other good place to eat in LA except for Pinks.
So, when you get to the negative numbers, you definitely find scammers (see table), like, you know, instantbiztips, Cam4porn (I don’t know what that one is), tweetstockstips, and this www.365buying.com. So the further down you go, the more distinct the scammers become.
This is an example (see screenshot), this is the site of a Twitter follower, he’s got a Friends-Followers Delta of -325, but he’s got a Tweet Number of 108.9, which means he is basically tweeting all the time. But no one is really following. And if you take a look at it and if you go to the site, it’s a free software site where you can download stuff for different activities. But if you take a look at the Google Diagnostic Page, it becomes more clear what it actually is – well, 10 Trojans, 4 exploits, 1 scripting exploit in the last 90 days that Google scanned it. So obviously, it’s not a very good site.
And with that, we are going to the top 10 search terms used by malware. This is actually the money shovel part of the presentation.
Paul Judge: So we looked at what’s going on across different search engines, we looked at what’s happening on Google, Bing, Yahoo!, and we drilled into Twitter a little bit more to see how the attackers are creating fake accounts. You know, we saw that over 70% of the accounts on there really aren’t using the network. We looked at the types of categories that malware likes and the ones that malware doesn’t like. And so we learned a lot about how this is happening, the scale of the search engine optimization attacks. We’ve talked about our categories, we’ve talked about the fact that we did this in 57 days, and we saw over 25,000 search terms and 5 million results. But out of those 25,000 search terms, there are some that are more popular than others. There are some that are used more by attackers. And so we want to understand what those search terms are, which ones are being used. It’s a very wide set of things. On the list we had a couple of NFL players, we had some politicians, some actresses (see image).
If you look at one of the guys on the list, the guy’s name is Adam Wheeler (see photo). Anybody heard of Adam Wheeler? It’s a guy who cheated his way into Harvard. He forged his transcripts and got a full Harvard scholarship, and now he is facing about 20 charges: identity fraud, forgery, larceny etc. So the poor guy is gonna get into some trouble now. So as this news broke, he became one of the top search terms.
So look at the top search term, it was a lady named Lois Wilson. Lois Wilson and her husband started Alcoholics Anonymous. The reason she was trending is that on April 24th a movie came out that told her life story. And if you look at all the malware results that we found, she was the top search term that was being used. You know, this is Defcon, and that’s not a very interesting term to have at the top of the list. So we went to our scientific poll, and we said: “Hm, let’s really understand what’s our favorite search term that was used by malware, kinda what’s the viewers’ choice?”