Privacy-Preserving Dynamic Learning of Tor Network Traffic

The code to generate a Shadow network topology is here and has its own READMEs. What follows is information about the input data specifically.

For the latency portion, here is the final output of our RIPE measurements and data parsing process from 2018. We did not rerun this collection process in 2020.

For the bandwidth portion, we provide here archives of the speed test data as it appeared in 2018 and 2020. Here are the final JSON output files of the parsing process for 2018 and 2020.

If you want to reproduce our results with fresh data, you will need to

  1. create a MaxMind account to download their GeoIP databases, which are used for both the latency and bandwidth parts of topology generation.
  2. scrape the speed test data for the bandwidth part.

GeoIP data

MaxMind does not allow us to provide the databases that we used. What follows is information on how to download the (current) databases for yourself.

Create a MaxMind account here.

After logging in, visit your Account Summary page.

You can now download GeoLite2 databases in two ways: in the browser manually, or via a URL with your license key for scripting.

Some of the links that follow contain the placeholder account number 222222. Replace it with your own account number before using them.

Manual with browser

Visit the Download Files page.

Click on the Download GZIP link for GeoLite2 City. It should download a .tar.gz.

Click on the Download ZIP link for GeoLite2 City: CSV Format. It should download a .zip.
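Once both archives are downloaded, their contents can be unpacked with a short script. The following is a minimal sketch using only the Python standard library. The helper names `extract_mmdb` and `list_csv_members` are hypothetical, and the sketch assumes only that the .tar.gz contains at least one `.mmdb` file and the .zip contains `.csv` files; check the actual archive layout before relying on it.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_mmdb(tar_gz_path, dest_dir):
    """Copy every .mmdb file found in a .tar.gz archive into dest_dir.

    Files are written under their base name only, so any directory
    prefix inside the archive is dropped.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(tar_gz_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(".mmdb"):
                target = dest / Path(member.name).name
                with archive.extractfile(member) as src:
                    target.write_bytes(src.read())
                written.append(target)
    return written


def list_csv_members(zip_path):
    """Return the names of all .csv files inside a .zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if name.endswith(".csv")]
```

Calling `extract_mmdb` on the downloaded .tar.gz returns the paths of the files it wrote, and `list_csv_members` on the .zip shows which CSV files are available before extracting anything.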

License key for scripting

Visit the My License Key page and generate a new license key. When asked, answer no: the key won’t be used for GeoIP Update. The next page will show you your account number and license key. Take note of both. The license key will look something like nNnNnNnNnNnNnNnN.

You need the GeoLite2 City database and the GeoLite2 City: CSV Format database. Replace YOUR_LICENSE_KEY in the following links to download GeoLite2 City.

# Database URL
# SHA256 URL

And likewise with the following two links for GeoLite2 City: CSV Format.

# Database URL
# SHA256 URL
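For scripted downloads, each database can be fetched together with its SHA256 file and checked before use. The sketch below is an illustration under stated assumptions, not the procedure used for the paper. The base URL, the `edition_id`, `license_key`, and `suffix` query parameters, and the edition names in the usage note are assumed from MaxMind's permalink scheme as we understand it and may have changed. It also assumes the SHA256 file starts with the hex digest. All function names are hypothetical.

```python
import hashlib
import urllib.parse
import urllib.request

# Assumption: MaxMind's permalink endpoint. Confirm it against the
# download links shown in your own account before relying on it.
BASE_URL = "https://download.maxmind.com/app/geoip_download"


def build_url(edition_id, license_key, suffix):
    """Build a download URL for one database edition and file suffix."""
    query = urllib.parse.urlencode({
        "edition_id": edition_id,
        "license_key": license_key,
        "suffix": suffix,
    })
    return f"{BASE_URL}?{query}"


def sha256_matches(data, checksum_text):
    """Compare data against a checksum file's first token (hex digest)."""
    expected = checksum_text.split()[0].lower()
    return hashlib.sha256(data).hexdigest() == expected


def download_verified(edition_id, license_key, suffix, out_path):
    """Download a database and its SHA256 file; save only if they agree."""
    with urllib.request.urlopen(build_url(edition_id, license_key, suffix)) as resp:
        data = resp.read()
    checksum_url = build_url(edition_id, license_key, suffix + ".sha256")
    with urllib.request.urlopen(checksum_url) as resp:
        checksum_text = resp.read().decode("ascii")
    if not sha256_matches(data, checksum_text):
        raise ValueError(f"SHA256 mismatch for {edition_id}")
    with open(out_path, "wb") as out_file:
        out_file.write(data)
```

Under these assumptions, `download_verified("GeoLite2-City", key, "tar.gz", "city.tar.gz")` would fetch the binary database and `download_verified("GeoLite2-City-CSV", key, "zip", "city-csv.zip")` the CSV one. Reading the license key from an environment variable rather than hard-coding it keeps it out of version control.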

Speed test data

Verify that the per-city source still lists reports for a variety of countries, and that clicking on a country's link leads to a page with per-city data in tables.

Verify that the per-country source still lists averages for a wide variety of countries.
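These checks can be done by eye, but a quick structural check on a saved copy of a page can confirm that it still contains tabular data. The following is a rough sketch using Python's standard `html.parser`. It is not taken from the code repo, the names `TableRowCounter` and `count_tables_and_rows` are hypothetical, and it assumes only that the data sits in ordinary `<table>` and `<tr>` elements.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Count <table> and <tr> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1
        elif tag == "tr":
            self.rows += 1


def count_tables_and_rows(html_text):
    """Return (number of tables, number of table rows) in html_text."""
    parser = TableRowCounter()
    parser.feed(html_text)
    parser.close()
    return parser.tables, parser.rows
```

A page that reports zero tables or only a handful of rows has likely changed layout, in which case the scraping scripts would need adjusting.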

The code repo walks you through how to download and parse this data.


Do not expect the data from 2018 to work with scripts in the code repo without some sort of modification: the scripts were organized, documented, and updated in 2020, but the 2018 data is merely provided as-is.

We expect the main pain points with processing 2018 data today to be: