You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information

More research showing how precise identification can be from metadata alone.

We demonstrate that through the application of a supervised learning algorithm, we are able to identify any user in a group of 10,000 with approximately 96.7% accuracy. Moreover, if we broaden the scope of our search and consider the 10 most likely candidates we increase the accuracy of the model to 99.22%. We also found that data obfuscation is hard and ineffective for this type of data: even after perturbing 60% of the training data, it is still possible to classify users with an accuracy higher than 95%.

I still don’t think this message is widely understood. When people are told that only metadata is kept on their activities, they assume some level of anonymity. You should assume none.
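To make the abstract's distinction between "identify the user" and "the user is among the 10 most likely candidates" concrete, here is a minimal sketch on purely synthetic data. It is not the paper's method or dataset: the nearest-centroid classifier is a simple stand-in for whatever supervised algorithm the authors used, the Gaussian "habit" features are an assumption standing in for real metadata fields, and every name in it (`make_users`, `sample_post`, `rank_candidates`, `top_k_accuracy`, `run_demo`) is hypothetical.

```python
import random


def make_users(n_users, n_features, rng):
    """Assumption: each user has a fixed 'habit' vector of metadata features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for _ in range(n_users)]


def sample_post(profile, noise, rng):
    """One observed post: the user's habits plus per-post Gaussian noise."""
    return [x + rng.gauss(0.0, noise) for x in profile]


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def rank_candidates(centroids, post):
    """User ids ordered from most to least likely (closest centroid first)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, post))
    return sorted(range(len(centroids)), key=lambda u: sq_dist(centroids[u]))


def top_k_accuracy(centroids, test_set, k):
    """Fraction of test posts whose true author is among the k best guesses."""
    hits = sum(1 for user, post in test_set
               if user in rank_candidates(centroids, post)[:k])
    return hits / len(test_set)


def run_demo(n_users=50, n_features=8, train_per_user=10,
             test_per_user=4, noise=1.0, seed=0):
    """Train on noisy posts per user, then return (top-1, top-10) accuracy."""
    rng = random.Random(seed)
    profiles = make_users(n_users, n_features, rng)
    centroids = [
        centroid([sample_post(p, noise, rng) for _ in range(train_per_user)])
        for p in profiles
    ]
    test_set = [(u, sample_post(p, noise, rng))
                for u, p in enumerate(profiles)
                for _ in range(test_per_user)]
    return (top_k_accuracy(centroids, test_set, 1),
            top_k_accuracy(centroids, test_set, 10))


if __name__ == "__main__":
    top1, top10 = run_demo()
    print(f"top-1 accuracy: {top1:.3f}, top-10 accuracy: {top10:.3f}")
```

Top-10 accuracy can never be lower than top-1 here, since any post whose author is the single best guess is also within the ten best guesses. The numbers this toy prints say nothing about real social media data; the point is only how the two measures relate.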

Posted on July 12, 2018

