countrywrangler

A library that simplifies the handling of country-related data. Easily standardize your data according to ISO 3166-1 and make it consistent across your project.


Keywords
iso-3166, iso-3166-1, normalize, countries-data, country, data-normalization, data-cleaning
License
MIT
Install
pip install countrywrangler==0.0.8

Documentation

Contributors Forks Stargazers Issues MIT License LinkedIn


Logo

CountryWrangler

CountryWrangler is a Python library that simplifies the handling of country-related data by converting country codes, names, TLDs, phone numbers, timezones, currencies, and languages to proper ISO 3166-1 Alpha-2 country codes. With CountryWrangler, you can easily standardize your data and make it consistent across your project. The library is designed for speed and efficiency, making it easy to process large datasets in no time.

Explore the docs »

Report Bug · Request Feature

About The Project

CountryWrangler is a high-performance Python library that has taken inspiration from pycountry and has surpassed it in terms of speed. With CountryWrangler, converting an Alpha-3 to an Alpha-2 (USA -> US) country code takes a mere 8900 nanoseconds, while the same conversion with pycountry on an Intel i7-10700K CPU @ 3.80GHz takes 282.636.700 nanoseconds.

While pycountry is primarily designed to serve as a database for ISO standards, CountryWrangler is specifically developed to normalize country data. Both libraries cater to their respective use cases. Additionally, CountryWrangler offers extra functions that are designed to handle messy country data with ease.

(back to top)

Getting Started

Installation

Binary installers for the latest released version are available at the Python Package Index (PyPI)

pip install countrywrangler

(back to top)

Basic Usage

Country Name to Alpha-2

name_to_alpha2 takes in a string and searches for a corresponding alpha-2 code in the database for both common and official country names in 34 different languages. If no match is found, None is returned.

Full documentation: https://countrywrangler.readthedocs.io/en/latest/normalize/country_name/

import countrywrangler as cw

alpha2 = cw.Normalize.name_to_alpha2("Germany")
print(alpha2)

>>> DE

The use_fuzzy=True option captures and matches virtually all variations of country names. Although using fuzzy lookup may incur a significant performance cost of approximately 100x slower than the normal lookup.

import countrywrangler as cw

alpha2 = cw.Normalize.name_to_alpha2("Germany Federal Republic of", use_fuzzy=True)
print(alpha2)

>>> DE

Country Code to Alpha-2

code_to_alpha2 converts both alpha-3 and alpha-2 codes to alpha-2 format, and returning None in the absence of a match. This can also be used to validate if a given string is a country code.

Full documentation: https://countrywrangler.readthedocs.io/en/latest/normalize/country_code/

import countrywrangler as cw

alpha2 = cw.Normalize.code_to_alpha2("GBR")
print(alpha2)

>>> GB

Phone Number to Alpha-2

phone_to_alpha2 accepts a string or integer representing a phone number in international format (E.164) and returns the corresponding ISO-3166-1 alpha-2 country code of the phone number's origin. If the input is not a valid phone number, the function returns None.

Warning Please ensure that the input provided is a valid phone number, as almost any numerical input can be matched to an alpha-2 country code. This function does not validate whether the input is a phone number.

Full documentation: https://countrywrangler.readthedocs.io/en/latest/normalize/phone/

import countrywrangler as cw

alpha2 = cw.Normalize.phone_to_alpha2("+1 (222) 333-4444 ")
print(alpha2)

>>> US

TLD to Alpha-2

tld_to_alpha2 retrieves the country code associated with a given Top-Level Domain (TLD). If a match is found, the function returns the country code in ISO-3166-1 alpha-2 format. Otherwise, it returns None.

Full documentation: https://countrywrangler.readthedocs.io/en/latest/normalize/tld/

import countrywrangler as cw

alpha2 = cw.Normalize.tld_to_alpha2(".co.uk")
print(alpha2)

>>> GB

Timezone to Alpha-2

timezone_to_alpha2 takes a geographic timezone name such as Europe/Vienna and returns the corresponding alpha-2 country code e.g., AT if it's an exact match. If there's no exact match, the function returns None instead.

Full documentation: https://countrywrangler.readthedocs.io/en/latest/normalize/timezone/

import countrywrangler as cw

alpha2 = cw.Normalize.timezone_to_alpha2("Europe/Vienna")
print(alpha2)

>>> AT

Language to Alpha-2

language_to_alpha2 matches ISO 639-1, ISO 639-2 language codes and IETF language tags to an ISO-3361-1 Alpha-2 country code. It is important to note that while IETF language tags will always be unambiguous, ISO codes may not be. For instance, the code ES can produce a list of country codes corresponding to all countries where Spanish is spoken.

Warning If it is not desired that ambiguous country codes are being returned as a list, the option allow_ambiguous=False can be passed as a parameter. This will restrict the output to a single, unambiguous country code. I case matching ambiguous countries is not turned off the function either returns a string (uambiguous) or a list (ambiguous), you code must be able to handle the different types.

Full documentation: https://countrywrangler.readthedocs.io/en/latest/normalize/language/

import countrywrangler as cw

alpha2 = cw.Normalize.language_to_alpha2("en-US")
print(alpha2)

>>> US

(back to top)

Roadmap

  • Fuzzy lookup for country names
  • Support language code to to alpha2
  • Option to exclude languages from country name matching
  • Support for subdivisions to alpha2
  • Support for city to to alpha2
  • Add more alternative country names

See the open issues for a full list of proposed features (and known issues).

(back to top)

Data Update Policy

Updates to the ISO 3361 standards are monitored and immediately added upon changes or additions. No changes to the data will be accepted into CountryWrangler unless it is plain simple wrong.

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks you!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

CountryWrangler is Open Source and distributed under the MIT License.

Copyright (c) 2023 Henry Wills - https://linktr.ee/thehenrywills

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

(back to top)

Contact

Henry Wills - Linktree - @TheHenryWills

Project Link: https://github.com/TheHenryWills/CountryWrangler

(back to top)

Donations / Monetary Support

Your donation helps support my work in creating high-quality content such as blog articles and tutorials, as well as maintaining my open-source projects. Every penny you donate goes directly towards these efforts, ensuring that these resources remain accessible and free for everyone.

Buy Me a Coffee at ko-fi.com

(back to top)

Acknowledgments

(back to top)