Academic Torrents Python API

pip install academictorrents==2.3.2


Academic Torrents Python API

Build Status codecov

This repository is an implementation of the BitTorrent protocol written in Python and downloadable as a pip module. You can download datasets from in two lines of code:

import academictorrents as at
path_of_giant_dataset = at.get("323a0048d87ca79b68f12a6350a57776b6a3b7fb") # Download mnist dataset

For people who want to download datasets

we are compatible with Python versions: 2.7, 3.4, 3.5, 3.6

To install: pip install academictorrents

This package works with the academictorrents tracker. You can add a hash from for your torrent, and download datasets into your project.

Here is a little example (it is implemented in examples/ in case you want to play with our source code):

# Import the library
import academictorrents as at

# Download the data (or verify existing data)
filename = at.get("323a0048d87ca79b68f12a6350a57776b6a3b7fb")

# Then work with the data
import pickle, gzip
import sys, os, time

mnist =, 'rb')
train_set, valid_set, test_set = pickle.load(mnist, encoding='latin1')

For Contributors to the AcademicTorrents Python client


We use Github issues and pull requests to manage development of this repository. The following is a guide for setting up the codebase to contribute PRs or try to debug issues.


Getting setup to work on this client is pretty easy. First, install the source code, then install the dependencies with pip:

git clone
cd at-python
pip install -r requirements.txt



We have a test suite that you can run with pytest -s tests/. These tests also run on travis after every push to github. Some of our tests are empty -- usually in parts of the codebase that have been changing quickly -- but we should continue increasing our coverage. If you want to just run one little download, use python examples/ (example code above).


The academictorrents module only has one "public" function, .get. This function checks for the torrent data on the filesystem. If it is not present, then it initiates a Client to download the data for us.

Client is our main thread, it manages the lifecycle of our other threads including Tracker, PeerManager, PieceManager, and many WebSeedManager threads. The Client thread uses the PieceManager class (not thread) to keep track of all the pieces of data. The main thread makes socket requests to Peers, and enqueues jobs to request data from HttpPeers. Socket responses are handled by PeerManager, whereas a fleet of WebSeedManagers threads handles the HttpPeers.