
Geolocation for Twitter

pip install carmen==2.0.0



A Python version of Carmen, a library for geolocating tweets.

Given a tweet, Carmen will return Location objects that represent a physical location. Carmen uses both coordinates and other information in a tweet to make geolocation decisions. It's not perfect, but this greatly increases the number of geolocated tweets over what Twitter provides.

To install, simply run:

$ python install

To run the Carmen frontend, see:

$ python -m carmen.cli --help

Geonames Mapping

Alternatively, locations.json can be swapped out for a version that uses Geonames IDs instead of the arbitrary IDs used in the original version of Carmen. This JSON file can be found in carmen/data/new.json.

Below are instructions on how mappings can be generated.

First, we need to get the data from Geonames. The required files are countryInfo.txt, admin1CodesASCII.txt, admin2Codes.txt, and cities1000.txt. Download these files and move them into carmen/data/dump/.
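Before moving on, it can help to confirm that all four files ended up in the right place. The following is a minimal sketch, not part of Carmen itself: the function name `missing_dump_files` is hypothetical, and the default directory assumes the script is run from the repository root.

```python
from pathlib import Path

# File names are taken from the instructions above.
REQUIRED_FILES = (
    "countryInfo.txt",
    "admin1CodesASCII.txt",
    "admin2Codes.txt",
    "cities1000.txt",
)


def missing_dump_files(dump_dir="carmen/data/dump"):
    """Return the required file names that are not present in dump_dir.

    Hypothetical helper for illustration; it only checks that each file
    exists and does not validate the contents.
    """
    dump_path = Path(dump_dir)
    return [name for name in REQUIRED_FILES if not (dump_path / name).is_file()]
```

Calling `missing_dump_files()` from the repository root returns an empty list once every file is in place, and otherwise lists the names that still need to be downloaded.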

Next, we need to format our data. We can simply delete the comments in countryInfo.txt. Afterwards, run the following.

$ python3
$ python3
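The comment-deletion step for countryInfo.txt can also be done with a short script rather than by hand. This is a minimal sketch under the assumption that comment lines are the ones beginning with `#`; the function names are hypothetical and are not part of the Carmen scripts.

```python
def strip_comment_lines(lines, marker="#"):
    """Yield only the lines that do not start with the comment marker."""
    for line in lines:
        if not line.startswith(marker):
            yield line


def strip_comments_in_file(path, marker="#"):
    """Rewrite the file at path in place, dropping its comment lines.

    Assumption: a comment is any line starting with `marker`.
    """
    with open(path, encoding="utf-8") as handle:
        kept = list(strip_comment_lines(handle, marker))
    with open(path, "w", encoding="utf-8") as handle:
        handle.writelines(kept)
```

For example, `strip_comments_in_file("carmen/data/dump/countryInfo.txt")` would rewrite the downloaded file in place, so keep a copy of the original if you may need the header comments later.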

Then, we need to set up a PostgreSQL database, as this makes finding relations between the original Carmen IDs and Geonames IDs significantly easier. To set up the database, create a PostgreSQL database named carmen and run the following SQL script:

$ psql -f carmen/sql/populate_db.sql carmen

Now we can begin constructing the mappings from Carmen IDs to Geonames IDs. Run the following scripts.

$ python3 > ../mappings/cities.txt
$ python3 > ../mappings/regions.txt

With the mappings constructed, we can finally attempt to convert the locations.json file into one that uses Geonames IDs. To do this, run the following.

$ python3
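To illustrate what this conversion step involves, here is a rough sketch of remapping IDs. It is not the repository's actual conversion script. It assumes, purely for illustration, that the locations file holds one JSON object per line with an `id` field, and that the mappings have already been loaded into a dictionary from Carmen ID to Geonames ID; the function name `remap_location_ids` is hypothetical.

```python
import json


def remap_location_ids(json_lines, id_map):
    """Yield JSON lines whose "id" has been replaced using id_map.

    Assumptions (not confirmed by the Carmen source): each input line is
    one JSON object with an "id" field, and id_map maps original Carmen
    IDs to Geonames IDs. Records with no entry in id_map pass through
    unchanged, and blank lines are skipped.
    """
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        old_id = record.get("id")
        if old_id in id_map:
            record["id"] = id_map[old_id]
        yield json.dumps(record, sort_keys=True)
```

Passing unmapped records through unchanged is one possible design choice; a stricter variant could collect them and report which Carmen IDs have no Geonames counterpart.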