pycharlockholmes

A character encoding detection library for Python using ICU and libmagic. Based on the Ruby implementation https://github.com/brianmario/charlock_holmes and the work of https://github.com/xtao/PyCharlockHolmes


License
Zed
Install
pip install pycharlockholmes==0.0.4


Dependencies

  1. icu
  2. file (libmagic)

Gentoo

emerge -av dev-libs/icu
emerge -av sys-apps/file

Ubuntu

apt-get install libicu-dev
apt-get install libmagic-dev

Brew

brew install icu4c
brew install libmagic
export ICUI18N="/usr/local/Cellar/icu4c/xx" # Replace "xx" with your installed icu4c version
export MAGIC="/usr/local/Cellar/libmagic/xx" # Replace "xx" with your installed libmagic version
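
Before building on any of the platforms above, it can help to check whether the two native libraries are visible to Python at all. The following is a minimal sketch using the standard library's ctypes.util.find_library. The helper name find_native_dependencies is hypothetical, and a "not found" result does not necessarily mean the build will fail (for example, Homebrew's icu4c is usually not on the default search path, which is why ICUI18N is exported above).

```python
from ctypes.util import find_library


def find_native_dependencies(names=("icui18n", "magic")):
    """Map each shared-library name to what the system linker lookup reports.

    Values are a library file name or path when found, or None otherwise.
    The names are the usual library names for ICU i18n and libmagic.
    """
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, location in find_native_dependencies().items():
        print(name, "->", location or "not found")
```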

Install

python setup.py build
python setup.py install

Usage

from charlockholmes import detect

# Read raw bytes so the detector sees the file exactly as stored on disk
with open('test.txt', 'rb') as f:
    content = f.read()
print(detect(content))
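
Once an encoding has been detected, the usual next step is to decode the bytes with it. The sketch below assumes that the detection result is a dict with an 'encoding' key, mirroring what the Ruby Charlock Holmes returns. That shape is an assumption and is not confirmed by this README, so inspect the real output of detect first. The helper decode_with_detection is hypothetical and does not depend on the library itself.

```python
def decode_with_detection(raw, result, fallback="utf-8"):
    """Decode bytes using a detection result shaped like {'encoding': 'ISO-8859-1'}.

    The dict shape is an assumption; adapt the key lookup to the real output.
    """
    encoding = (result or {}).get("encoding") or fallback
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: fall back and
        # replace undecodable bytes instead of raising.
        return raw.decode(fallback, errors="replace")


if __name__ == "__main__":
    # Hand-written stand-in for a detection result (not real library output)
    sample = {"encoding": "ISO-8859-1"}
    print(decode_with_detection(b"caf\xe9", sample))
```

With the real library, the second argument would be the value returned by detect(content), provided it has the assumed shape.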

License

Modified BSD License