geocoding

Geocoding application


Keywords
geonames, reverse-geocoding
License
BSD-3-Clause

Documentation

Geocoding does not rely on any external API. It relies on an internal (huge) database of 156710 coordinates corresponding to cities of more than 500 inhabitants (see Data sources for more details).

Release notes

  • (v0.3.1): add a script to rebuild cities.txt internal database with up-to-date data from https://geonames.org
  • (v0.3.0): lookup city information from country and name
  • (v0.2.0): add distance function
  • (v0.1.0): reverse geocoding function

Compile, test and try

Compilation:

rebar compile

Tests:

rebar eunit

Usage:

> erl -pa _build/default/lib/geocoding/ebin/ 
Erlang/OTP 23 [erts-11.1.8] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [hipe] [dtrace]

Eshell V11.1.8  (abort with ^G)
1> application:start(geocoding).
ok

2> geocoding:reverse(48.857929, 2.346707).
{ok,{europe,fr,<<"Paris">>,525.451956}}

3> geocoding:distance({48.857929, 2.346707}, {40.7630463, -73.973527}).
5832947

4> geocoding:lookup('FR', <<"paris">>).
{ok,{2988507,{48.85341,2.3488},europe,'FR',<<"Paris">>}}

Installation

Erlang (rebar3):

{deps, [
  {geocoding, "0.3.0"}}
]}

Elixir (mix):

defp deps do
  [
    {:geocoding, "~> 0.3.0"}
  ]
end

Database rebuilding

An escript allows to rebuild your cities.txt data from geonames. It will fetch the last version of cities500.zip, unzip it, filter out city with too low population, and build continent name. It will generate a cities.txt.new file. Existing citiers.txt will not be erased.

> ./scripts/cities_builder.escript

If you want to replace your cities.txt, add --replace parameter. Old cities.txt will be renamed to cities.txt.old.

> ./scripts/cities_builder.escript --replace

Technical information

Algorithm

Reverse geocoding is done thanks to a k-d tree algorithm. We use Martin F. Krafft implementation. Original source code is here: https://github.com/kbranigan/libkdtree/tree/master/kdtree%2B%2B . It is embeded in an erlang driver.

From latitude/longitude coordinates, geocoding finds the nearest point in our locations database. Associated data are returned, as well as a distance between provided coordinates and real coordinates. Since reverse geocoding relies only on coordinates, strange behaviour may occured when a big city is near a small one: a point inside the big city and near its border may be associated to the small city because the small city coordinates will be nearest than the big city coordinates.

Data sources

All locations data come from https://www.geonames.org database (http://download.geonames.org/export/dump/cities500.zip - 2021-03-27) :

  • Locations without population have been excluded.
  • Fields have been reduced to: geonameId, latitude, longitude, country code (ISO-3166), standard name. A 6th fields have been added between longitude and country code: continent. Each field is separated by a tabulation.

Example:

2988507 48.85341        2.3488  europe  FR      Paris

Each location is associated to one of the following continents:

  • africa (africa and islands nearby like Madagascar, Canary, La Réunion...)
  • antarctica (only one country)
  • asia (including russian cities after Ural Mountains)
  • europe (including russian cities before Ural Mountains)
  • oceania (including Australia continent and pacific islands)
  • north america
  • south america

Missing locations may be added on request.