Researchers at the Sapienza University of Rome have developed an algorithm that can detect the activity of an Android app operating through Tor traffic.
Notably, the newly developed algorithm cannot reveal user-identifying details such as IP addresses, so it is not a deanonymization script.
What the algorithm can detect, however, is whether a Tor user is running a particular Android app.
The researchers hail from Sapienza University’s Department of Computer, Control and Management Engineering.
Last month, they published a study sharing all the details of their Tor findings.
In their report [PDF], the researchers also state that they plan to release their algorithm’s code.
The publication adds to the findings of past studies on the subject, which analyzed Tor’s TCP packet flows and categorized traffic into eight different types: email, chat, browsing, video streaming, audio streaming, file transfers, P2P and VoIP.
10 Selected Apps Studied
The computer engineering researchers at Sapienza University of Rome used a method similar to the one used by the Canadian Institute for Cybersecurity in the aforementioned research.
They analyzed the TCP packets running through a Tor connection in order to find patterns specific to certain Android apps.
They trained the algorithm they developed with the Tor traffic patterns of 10 different apps: Instagram, Facebook, uTorrent, Skype, the Tor Browser Android app, Spotify, Twitch, YouTube, Replaio Radio and DailyMotion.
Then, they were able to aim the algorithm at Tor traffic and recognize if the user was using one of the 10 apps.
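The general train-then-classify idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' method: the three features, the nearest-centroid classifier, the "streaming" and "chat" labels and all the flow data below are made up for the example, since the study's actual features and model are not described here.

```python
from math import dist
from statistics import mean

# A flow is a list of (timestamp_seconds, size_bytes, direction) tuples.
# direction is +1 for packets sent by the client and -1 for packets received.
# This representation is an assumption made for the sketch.

def flow_features(flow):
    """Summarize a flow as (mean packet size, mean inter-arrival gap, downstream share)."""
    sizes = [size for _, size, _ in flow]
    times = [t for t, _, _ in flow]
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    downstream = sum(1 for _, _, d in flow if d < 0) / len(flow)
    return (mean(sizes), mean(gaps), downstream)

def train(labelled_flows):
    """Compute one centroid (average feature vector) per label."""
    by_label = {}
    for label, flow in labelled_flows:
        by_label.setdefault(label, []).append(flow_features(flow))
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }

def classify(centroids, flow):
    """Return the label whose centroid is closest to the flow's features."""
    features = flow_features(flow)
    return min(centroids, key=lambda label: dist(centroids[label], features))

def synthetic_flow(size, gap, up_every, n=20):
    """Build an invented flow: every `up_every`-th packet is upstream, the rest downstream."""
    return [(i * gap, size, 1 if i % up_every == 0 else -1) for i in range(n)]

# Hypothetical training data: two invented traffic classes, two flows each.
training = [
    ("streaming", synthetic_flow(1400, 0.010, 10)),
    ("streaming", synthetic_flow(1350, 0.012, 10)),
    ("chat", synthetic_flow(200, 0.5, 2)),
    ("chat", synthetic_flow(250, 0.4, 2)),
]
centroids = train(training)

print(classify(centroids, synthetic_flow(1300, 0.011, 10)))
```

The features are left unscaled to keep the sketch short, so packet size dominates the distance. A real pipeline would normalize the features and use far more training data.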
Findings from the Experiment
The tests showed the algorithm to be 97.3 percent accurate.
Nonetheless, the algorithm they developed isn’t as ideal as it sounds.
For instance, the algorithm works only when there is no background traffic, meaning the user is running a single app on their mobile device and nothing else.
If there are too many apps running at the same time in the background, the TCP traffic patterns are jumbled and the efficiency of the algorithm drops.
The accuracy of some results is also problematic. For example, media streaming-based apps like Spotify or YouTube produce similar traffic patterns that can lead to false positives.
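A high overall accuracy can coexist with systematic confusion between two similar classes. The following sketch shows one way to surface that from a list of predictions. The labels and counts are invented for illustration and are not results from the study.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

def accuracy(counts):
    """Share of samples whose predicted label matches the true label."""
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return correct / sum(counts.values())

def most_confused_pair(counts):
    """Return the (true, predicted) pair with the most errors, or None if there are none."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return max(errors, key=errors.get) if errors else None

# Invented example: 10 samples per class, with errors only between the
# two streaming-style classes.
true_labels = ["audio"] * 10 + ["video"] * 10 + ["browser"] * 10
predicted = (
    ["audio"] * 8 + ["video"] * 2      # two audio flows mistaken for video
    + ["video"] * 9 + ["audio"] * 1    # one video flow mistaken for audio
    + ["browser"] * 10
)

counts = confusion_counts(true_labels, predicted)
print(accuracy(counts), most_confused_pair(counts))
```

In this made-up data every error falls between the audio and video classes, which is the kind of pattern an overall accuracy figure alone would not show.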
Apps such as Instagram, Facebook and the Tor Browser app can also pose a problem because of their lengthy idle periods: while the user goes through the content already loaded, network activity goes silent.
The researchers plan to experiment with additional apps and machine learning algorithms in the future, and they also stated that they plan to publicly release their algorithm’s code.