The Tor Project enforces strict guidelines on the use of its anonymity network.
That is why it banned the Tor relay of a university research group in Brazil after discovering that the group was harvesting the .onion addresses of Tor hidden services.
One of the researchers, Marcus Rodrigues, posted to a Tor mailing list to explain the mechanisms and purpose of the research.
Rodrigues is a junior researcher in the Laboratory of Security and Cryptography at the University of Campinas in São Paulo.
He went on to explain that the research group was developing an online tool that could differentiate between benign and malicious hidden services.
The team then decided to look into Tor addresses and harvest large volumes of their web pages.
Rodrigues has stated that his research focused mainly on techniques that would automatically index malicious hidden services, such as malware hosting services and drug dealing sites, based on the content hosted therein.
He said the researchers had planned to publish an academic paper detailing real-time statistics on the types of malicious programs that are operational on the dark web.
The team was also working on a platform that internet users could use to check which Tor websites contain malicious content before visiting them.
Rodrigues readily revealed the techniques that they were using to achieve this.
He stated that they had modified their Tor node to gather specific information about hidden services.
He was quick to give assurances that the collected data could not in any way reveal the identity of a user or a hidden service.
The data collected included the sites’ .onion addresses, the sites’ popularity, and technical information that would not allow the researchers to unmask or negatively affect a Tor hidden service.
In the Tor mailing list post, Rodrigues admitted that the group’s relay was collecting .onion addresses and duly apologized for the violation of the ethical guidelines set forth by the Tor Project.
He explained that they were using a web crawler to collect the data required for their research, but that certain information could not be obtained through crawling alone.
This included the size of the Tor network, the number of hidden services running HTTP, the number running other protocols, and which protocols those were.
The researchers needed to harvest .onion addresses in order to access some of this information.
They would run a port scan on each address they collected and, where a web server was running, download the site’s index page.
Rodrigues added that they also attempted to determine the longevity of the addresses they managed to harvest.
According to Rodrigues, the researchers intended to delete the Tor addresses they had collected over the course of their research.
The data obtained from specific addresses would not be disclosed.
In addition, the researchers only targeted a random sample of addresses rather than specific ones.
These actions led to their Tor relay being banned by the Tor Project admins.
The Tor administrators clearly stated that the researchers’ activities were a direct violation of the ethical research guidelines put forth by the Tor Project.
Harvesting of .onion addresses is listed as an example of unacceptable research activity in these guidelines.
This follows from a general principle behind the guidelines: data that would not be acceptable to publish should not be collected in the first place.
By the researchers’ own account, the data they gathered fell into that category, so collecting it was not ethical.
Rodrigues offered to make the process transparent in a bid to continue running their Tor relay.
He added that operating the Tor relay was essential for future research at the university.
But the Tor relay node is still offline, and whether the Tor Project administrators will unban the relay is uncertain.
The team stated that they can still conduct their research through other methods, though these are less effective.