Finding Doppelgangers: Taking Stylometry to the Underground
Sadia Afroz (UC Berkeley) is using stylometry to find out who is interacting on underground (cybercrime) forums. The goal is to figure out what these users are doing there in the first place and who is really doing the work.
Current research on deanonymizing users in social networks focuses on similar usernames, but if you really care about being anonymous, you won’t fall into that trap. The next thing to look at is similar activities or social networks. Most people will write a Facebook post and a tweet about the same event or activity, so the match is easy to find. This doesn’t work for underground forums, though. So instead, the researchers use stylometry to analyze writing style.
Stylometry is based on the idea that everyone has a unique writing style, unique in ways you are not aware of, so it’s hard to modify. To do this, you analyze the frequency of punctuation and connector words, n-grams, etc. You need quite a large writing sample to analyze, the larger the better, but you can still get some accuracy on small samples.
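To make the feature idea concrete, here is a minimal sketch of what counting punctuation, connector words, and character n-grams could look like. The word list, the punctuation set, and the function name `stylometric_features` are my own illustrative assumptions, not the feature set used in the talk.

```python
import re
from collections import Counter

# Illustrative choices only; the talk did not list its exact features.
FUNCTION_WORDS = ("and", "but", "or", "so", "because", "however")
PUNCTUATION = ".,;:!?"


def stylometric_features(text, n=3):
    """Return relative frequencies of punctuation, connector words, and character n-grams."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    total_chars = max(len(text), 1)
    total_words = max(len(words), 1)

    features = {}
    # Punctuation frequency, relative to the length of the text.
    for mark in PUNCTUATION:
        features["punct:" + mark] = text.count(mark) / total_chars

    # Connector-word frequency, relative to the number of words.
    word_counts = Counter(words)
    for word in FUNCTION_WORDS:
        features["word:" + word] = word_counts[word] / total_words

    # The five most common character n-grams, relative to all n-grams.
    ngrams = Counter(lowered[i:i + n] for i in range(len(lowered) - n + 1))
    total_ngrams = max(sum(ngrams.values()), 1)
    for gram, count in ngrams.most_common(5):
        features["ngram:" + gram] = count / total_ngrams
    return features
```

Two accounts could then be compared by feeding vectors like these to a classifier or a distance measure. With only a handful of posts the frequencies get noisy, which fits the point about needing large samples.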
They looked at four forums: one in Russian (Antichat), one in English (BlackhatWorld), and two in German (Carders/L33tCrew). People move from one forum to another, but it is not always easy for researchers to get the full data sample.
Problems? These forums are not all in English, and posts are often in l33tsp3ak (pwn3d). Also, people aren’t speaking in their natural voice; they are making sales pitches, which are more likely to overlap with other accounts that aren’t actually the same person.
They parsed l33tsp3ak using regular expressions, and added parsing to separate “pitches” from “conversation”: if a post has no verbs and repeats items in lists, it’s most likely a sales pitch, and it was eliminated.
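The talk did not show the actual expressions, so the following is only a sketch of how regex-based l33tsp3ak normalization might work. The substitution table `LEET_MAP` and both function names are hypothetical, and the verb-based pitch filter is left out because it would need a part-of-speech tagger.

```python
import re

# Hypothetical substitution table; a real one would be tuned per forum and language.
LEET_MAP = {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"}
LEET_PATTERN = re.compile("[" + "".join(LEET_MAP) + "]")


def normalize_leet(token):
    """Replace leetspeak digits, but only in tokens that mix letters and digits."""
    has_letter = re.search(r"[a-z]", token, re.IGNORECASE)
    has_digit = re.search(r"\d", token)
    if not (has_letter and has_digit):
        return token  # plain words and pure numbers (prices, IDs) stay untouched
    return LEET_PATTERN.sub(lambda match: LEET_MAP[match.group()], token)


def normalize_text(text):
    """Normalize every whitespace-separated token in a post."""
    return " ".join(normalize_leet(token) for token in text.split())
```

For example, `normalize_text("l33tsp3ak pwn3d 100")` gives `"leetspeak pwned 100"`. Leaving pure numbers alone matters here, since forum posts are full of prices and contact numbers.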
Then it seems to be all about probability: what is the likelihood that two accounts belong to the same person? Lots of analysis followed, like: do these accounts talk to each other or about each other? Do they have similar usernames, ICQ numbers, signatures, contact information, account information, or topics? Did they ever get banned? (Moderators do not like one person holding multiple accounts.)
People can sell their accounts; established accounts with a higher rank can be sold for more. Some people also want to “brand”, so they sell different things with each account (like CC numbers with one, marijuana with another).
You could avoid detection by writing less (lowering rank), or you could use their tool, Anonymouth 🙂
From Phish to Phraud
Presented by Kat Seymour, senior security analyst at Bank of America. The talk started out great with a reference to Yoda. Every talk should have a reference to Yoda!
Phishing used to be about silly things like weight loss pills and male enhancement pills. But it’s grown up: there’s real money to be made here. $4.9 billion was lost to phishers last year.
Attacks come from all over the place now (mobile, voicemail, emails, websites), and they’ve matured. No longer plain text filled with spelling errors, they now feature stolen corporate branding and well-written emails. Phishers are also taking over websites that aren’t well watched or maintained.
Ms. Seymour can look at things in the URL to find out more about the phisher (and to help learn suspicious patterns). She can also find the IP address to do further research. Additionally, she can leverage the Internet Archive (aka the Wayback Machine) to see if the website has changed a lot recently, which is evidence of a takeover.
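The talk did not spell out which URL details she looks at, so this is only a sketch of the kind of breakdown that could support that research. The function name `describe_url` and the chosen fields are my assumptions; the parsing itself uses Python’s standard `urllib.parse` and `ipaddress` modules.

```python
import ipaddress
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Break a suspicious URL into parts an analyst might want to compare."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    # A raw IP address instead of a domain name is one commonly cited warning sign.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "host": host,
        "host_is_ip": host_is_ip,
        # Number of labels in front of the last two (e.g. "secure.login" -> 2).
        "subdomain_depth": 0 if host_is_ip else max(host.count(".") - 1, 0),
        "path_segments": [segment for segment in parts.path.split("/") if segment],
        "query_keys": sorted(parse_qs(parts.query)),
    }
```

Collecting these fields across many reported URLs makes recurring patterns, such as the same path layout on different hosts, easier to spot.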
She pays attention to referrers to their website. If a new referrer shows up suddenly in their logs and then disappears, it’s likely a phishing site, so she then has to watch the accounts that logged in through there for suspicious activity (in addition to doing further research on the referring site).
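As a rough illustration of the referrer idea, the sketch below flags referrers that are new, were only active for a short window, and have since gone quiet. The function name, the `(timestamp, referrer_host)` input shape, and both thresholds are assumptions of mine, not details from the talk.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def short_lived_referrers(events, known, now,
                          max_lifetime=timedelta(hours=6),
                          quiet_for=timedelta(hours=24)):
    """Return referrer hosts that are new, were active briefly, and then went quiet.

    events: iterable of (timestamp, referrer_host) pairs taken from web logs.
    known:  set of referrer hosts already considered legitimate.
    """
    seen = defaultdict(list)
    for timestamp, host in events:
        if host not in known:
            seen[host].append(timestamp)

    flagged = []
    for host, stamps in seen.items():
        first, last = min(stamps), max(stamps)
        # Brief burst of traffic, followed by silence up to the present.
        if last - first <= max_lifetime and now - last >= quiet_for:
            flagged.append(host)
    return sorted(flagged)
```

The flagged hosts are only leads. As the talk notes, the follow-up is to watch the accounts that logged in via those referrers and to research the referring site itself.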
It’s not as simple as blocking IPs – she can’t control your personal machines… and all of the places you might be coming from.
She needs to work with ISPs to block known phishing websites, but ISPs are spread all over the world. She can watch logs, traffic analysis, and referrers, but the phishers are constantly coming up with new ways of doing this. It would be great to work with email providers to get them to watch out for this, but they are too diverse (some email providers are trying to address this, but it is difficult to coordinate).
Advice? Watch your statements, watch your statements, watch your statements!