Imagine that your employer reviewed your tweets or other social media posts as part of your performance review. Would you have any reason to be concerned about the outcome? What about potential employers reviewing your online remarks when deciding whether to hire you? Or the reviewers at the financial institution where you are applying for a mortgage, loan, or credit card? Or insurance company investigators determining your rates, or whether you are even a suitable candidate for insurance? Or legal investigators working for the law firm hired by the person you accidentally hit at a traffic stop last month? Or the police officer sitting in a squad car, gathering data on you while deciding whether to give you a warning or write that ticket?
If your online presence is restricted purely to bland, pedestrian remarks and retweets about your niece, the waitstaff at local restaurants, or kitten videos, you’re probably fine. Move on. But if you tend to own and express your own beliefs on social media accounts–especially if those beliefs are outside the mainstream–take heed. Neither a user ID like “froggyvomit” nor logging in through Tor is going to protect you. That is according to the results of a recent study revealing the alarming accuracy of programs that gather and use the metadata of online users…
*Overview of Findings
Researchers at University College London and the Alan Turing Institute found they could correctly identify a Twitter user from a group of 10,000 with 96.7 percent accuracy, using just their tweets and publicly available metadata.
The goal was “to determine if the information contained in users’ metadata is sufficient to fingerprint an account,” and the results reveal how much identifying information is tied to Twitter accounts, whose users may believe they are tweeting anonymously. A single tweet contains about 144 fields of metadata.
Researchers took 14 pieces of metadata from 5 million Twitter accounts – including the date the account was created, its followers, the accounts it follows and the tweets it likes – and ran them through three machine-learning algorithms. The researchers found that the most basic algorithm was the most accurate.
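To make the idea of "fingerprinting" an account by its metadata concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data is synthetic, the three field names are hypothetical stand-ins for the study's 14 fields, and the nearest-neighbour matching is simply an easy technique to show, not necessarily one of the three algorithms the researchers used.

```python
import math
import random

random.seed(0)

# Hypothetical metadata fields (the study used 14; these names are illustrative).
FIELDS = ("followers", "following", "likes")

def make_profile():
    """Synthetic 'true' metadata for one account (made-up values)."""
    return [random.uniform(0, 10) for _ in FIELDS]

def observe(profile, noise=0.05):
    """Metadata seen on one tweet: the account's profile plus small drift."""
    return [v + random.gauss(0, noise) for v in profile]

def identify(sample, known):
    """Return the account whose stored metadata is closest to the sample."""
    return min(known, key=lambda user: math.dist(sample, known[user]))

# 200 synthetic accounts, each with one stored observation.
profiles = {f"user{i}": make_profile() for i in range(200)}
known = {user: observe(p) for user, p in profiles.items()}

# Try to re-identify every account from a fresh observation of its metadata.
hits = sum(identify(observe(p), known) == user for user, p in profiles.items())
accuracy = hits / len(profiles)
print(f"re-identified {hits} of {len(profiles)} accounts")
```

Note that the user name never enters the matching. The account is picked out purely by how its numbers line up, which is why changing a handle offers little cover.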
The same methods could be used to identify an account that has changed its name, to link multiple accounts created by one user, or to tell whether a legitimate account has been taken over by a malicious user.
The researchers also found that obfuscation strategies are ineffective: even when 60 percent of the data was muddled or altered, users could still be identified with more than 95 percent accuracy. (When the researchers widened their scope and searched for the 10 most likely candidates, their accuracy was 99.22 percent.)
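The two measurements described here, accuracy after obfuscation and accuracy when the true account only has to appear among the 10 most likely candidates, can be sketched as follows. This is a toy under stated assumptions: the data is synthetic, the field-matching score and its tolerance are hypothetical choices made for illustration, and the numbers it prints say nothing about the study's actual results.

```python
import random

random.seed(1)
N_FIELDS = 14    # the study used 14 metadata fields
N_USERS = 300    # toy size; the study worked with groups of up to 10,000

# Synthetic stored metadata: one vector of made-up values per account.
known = {f"user{i}": [random.uniform(0, 10) for _ in range(N_FIELDS)]
         for i in range(N_USERS)}

def obfuscate(sample, fraction):
    """Overwrite a random `fraction` of the fields with random values."""
    out = list(sample)
    for i in random.sample(range(len(out)), round(fraction * len(out))):
        out[i] = random.uniform(0, 10)
    return out

def matches(a, b, tol=0.3):
    """Count the fields on which two metadata vectors roughly agree."""
    return sum(abs(x - y) <= tol for x, y in zip(a, b))

def rank(sample):
    """Order all accounts from most to fewest agreeing fields."""
    return sorted(known, key=lambda u: matches(sample, known[u]), reverse=True)

def evaluate(fraction, ks=(1, 10)):
    """Share of accounts found among their own top-k candidates, per k."""
    hits = {k: 0 for k in ks}
    for user, vec in known.items():
        ranked = rank(obfuscate(vec, fraction))
        for k in ks:
            hits[k] += user in ranked[:k]
    return {k: h / len(known) for k, h in hits.items()}

result = evaluate(0.6)   # 60 percent of fields altered, as in the study
print(f"top-1: {result[1]:.2%}  top-10: {result[10]:.2%}")
```

The design point is that the fields left untouched still agree with the stored record, so altering part of the metadata does not erase the fingerprint. Widening the search from one candidate to ten can only help, because the top-10 list always contains the top-1 guess.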
Users can be identified on nearly any social media account
If Twitter is not your thing, that does not matter. You can be identified on the other popular social media platforms as well…
*While the study uses Twitter as its subject, its authors note “the methods presented in this work are generic and can be applied to a variety of social media platforms with similar characteristics in terms of metadata.”
Study (pdf): You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information
*Source: Metadata can identify even secret users ‘with 96.7% accuracy’