A new open-source web crawler has recently been unveiled to serve the Tor community. The crawler, called Fresh Onions, is designed for indexing hidden services on the Tor network.
The crawler is designed to be accurate and easy to set up compared with other Tor indexing tools. It shares statistics and insights about hidden services, and it also keeps you updated with news about the Tor network and the Tor Project.
In addition, by providing its own search engine for the deep web, the tool makes hidden services accessible to anyone browsing onion links. It does not restrict its services to a specific client base, such as early adopters.
The Fresh Onions crawler is hosted at zlal32teyptf4tvi.onion.*
Features At A Glance
Fresh Onions comes with a range of features that you wouldn’t normally find in common Tor indexing tools:
- It finds new hidden services by crawling the Tor network, and it also finds hidden services listed on several hundred clearnet sources.
- The tool is backed by a search engine with optional Elasticsearch support for full-text search.
- It also marks clone sites of the /r/darknet superlist.
- Additionally, it can locate SSH fingerprints across hidden services. Because hosts that present the same fingerprint are likely the same machine, a matching fingerprint may reveal the IP address of a service.
- Fresh Onions also finds email addresses and Bitcoin addresses across hidden services.
- It shows both the incoming and outgoing links of onion domains.
- It keeps the status of each hidden service up to date, so you can always check whether a particular service is still operating or already dead.
- Fresh Onions performs port scanning and searches for “interesting” URL paths.
- It will also detect 404 errors right away.
- The tool automatically detects page language and fuzzy clones. The fuzzy clone feature, however, requires Elasticsearch.
With these features, Fresh Onions shows promise of becoming one of the best Tor indexing tools available.
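To make the address-finding features more concrete, here is a minimal sketch of how email addresses, Bitcoin addresses, and onion hostnames could be pulled out of crawled page text with regular expressions. It is not the crawler's actual code: the function name extract_indicators and all three patterns are assumptions made for illustration, and the Bitcoin pattern only covers legacy Base58 addresses.

```python
import re

# Illustrative patterns only; the patterns Fresh Onions really uses are not shown in this article.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Legacy Base58 Bitcoin addresses: start with 1 or 3, 26-35 characters, no 0, O, I or l.
BITCOIN_RE = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")
# 16-character onion hostnames such as the one quoted in this article.
ONION_RE = re.compile(r"\b[a-z2-7]{16}\.onion\b")


def extract_indicators(text):
    """Return sorted, de-duplicated matches for each pattern found in `text`."""
    return {
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "bitcoin": sorted(set(BITCOIN_RE.findall(text))),
        "onions": sorted(set(ONION_RE.findall(text))),
    }
```

Calling extract_indicators on the text of each fetched page and storing the results per onion domain would be one way to support searches such as "which services mention this Bitcoin address". A production crawler would likely also validate the Base58 checksum to reduce false positives.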
Infrastructure & Dependencies
Currently, Fresh Onions runs on two servers: a frontend host that runs the database and the hidden service website, and a backend host that runs the crawler.
In terms of dependencies, the tool requires Python and Tor. The Python packages it needs are listed in the requirements.txt file and can be installed with pip (pip install -r requirements.txt).
Installing the Tor Indexing Tool
To install the tool, follow the steps below:
- First, create a MySQL database from schema.sql.
- Next, edit etc/database to match your database setup.
- Then edit etc/proxy to match your Tor setup.
- Run script/push.sh once for each onion directory you want to hand to the crawler, for example:
- script/push.sh youroniondirectory.onion
- script/push.sh newoniondirectory.onion
- Now, edit etc/uwsgi_only and set the base directory (BASEDIR) to the location where the Tor scraper is installed (for instance, /home/user/torscraper).
- Once the configuration is done, start the tool by running:
- init/scraper_service.sh # starts crawling.
- init/is_upservice.sh # keeps the status of each site updated.
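The command-line part of the steps above can also be scripted. The following is a minimal sketch, assuming the script paths are exactly as listed and that the database and configuration files have already been prepared. The helper names seed_commands and start are hypothetical, and launching the two init scripts in the background is an assumption, since the article does not say whether they return immediately.

```python
import os
import subprocess

# Service scripts as named in the steps above.
SERVICE_COMMANDS = [
    ["init/scraper_service.sh"],  # starts crawling
    ["init/is_upservice.sh"],     # keeps site status updated
]


def seed_commands(seed_directories):
    """Build one script/push.sh call per onion directory."""
    return [["script/push.sh", onion] for onion in seed_directories]


def _execute(argv, basedir, wait):
    """Run a script located under basedir, optionally waiting for it to finish."""
    full = [os.path.join(basedir, argv[0])] + argv[1:]
    if wait:
        subprocess.run(full, cwd=basedir, check=True)
    else:
        subprocess.Popen(full, cwd=basedir)


def start(basedir, seed_directories, dry_run=True):
    """Push the seed directories, then launch both services.

    Returns the commands in the order they were (or would be) issued.
    With dry_run=True nothing is executed.
    """
    issued = []
    for argv in seed_commands(seed_directories):
        issued.append(argv)
        if not dry_run:
            _execute(argv, basedir, wait=True)
    for argv in SERVICE_COMMANDS:
        issued.append(argv)
        if not dry_run:
            _execute(argv, basedir, wait=False)  # assumption: services keep running
    return issued
```

Calling start("/home/user/torscraper", ["youroniondirectory.onion"]) with the default dry_run=True only returns the planned commands, which is a safe way to check the order before running anything for real.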
Elasticsearch for Full-Text Search
The Tor scraper also comes with optional Elasticsearch support, which is enabled by default:
- Edit etc/elasticsearch to configure it.
- To disable it, set ELASTICSEARCH_ENABLED=false.
- Run scripts/elasticsearch_migrate.sh to perform the initial setup after the first configuration.
- If you disable Elasticsearch, full-text search will no longer be available. The crawler will, however, still be able to crawl new hidden services that are up and running.
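As an illustration of how such a switch typically behaves, the sketch below gates full-text indexing on the ELASTICSEARCH_ENABLED setting while leaving crawling untouched. It assumes the setting reaches the process as an environment variable, which the article does not confirm. The functions elasticsearch_enabled and index_page, and the indexer callback, are hypothetical names used only for this example.

```python
import os


def elasticsearch_enabled(environ=None):
    """Return False only when ELASTICSEARCH_ENABLED is set to a false-like value.

    An unset variable counts as enabled, mirroring the default described above.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("ELASTICSEARCH_ENABLED", "true")
    return value.strip().lower() not in {"false", "0", "no", "off"}


def index_page(url, text, indexer, environ=None):
    """Pass page text to a hypothetical full-text indexer when the feature is on.

    Returns True if the page was indexed, False if indexing was skipped.
    Skipping here does not stop the page from being crawled.
    """
    if not elasticsearch_enabled(environ):
        return False
    indexer(url, text)
    return True
```

Keeping the check in one small function means the rest of the crawl loop does not need to know whether a search backend is present.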
Licensing for Tor Indexing Tool
The Fresh Onions crawler is released under the GNU Affero General Public License version 3 (AGPLv3). This means you can use the tool as part of your own publicly available software. But in order to do that, you will have to make the complete source code, including any modifications you make, available under the same license. This requirement also applies when users only interact with the software over a network.
* Editor’s note: Before visiting any darknet site, double check with a trusted source to ensure you’re using the exact URL. Dark web URLs change often, so you should always confirm a link is verified before navigating to it. In this case, the developers of Fresh Onions have posted the correct URL on GitHub.