An OCR tool for Traditional Chinese


License
Apache-2.0
Install
pip install simpleocr==0.0.23

Documentation

Simpleocr library

Simpleocr is a Python OCR package for Traditional Chinese, based on deep learning.

The library consists of text localization and text recognition.

Text localization

The model is a TensorFlow reimplementation of CRAFT (Character-Region Awareness For Text detection).

paper | github

Text recognition

The model is a reimplementation of CRNN in which the RNN layer is replaced with a self-attention layer.

CRNN

paper

Self attention

paper
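
As a rough illustration of what a self-attention layer computes over a feature sequence, here is a minimal pure-Python sketch of scaled dot-product self-attention. It is not simpleocr's code: the function names are made up for this example, and the learned query/key/value projections that a real layer would have are left out (identity projections are assumed).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over a sequence of feature vectors.

    `features` is a list of T vectors, each of dimension d.
    Assumption: queries, keys and values are the inputs themselves
    (no learned projection matrices), to keep the sketch short.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    output = []
    for query in features:
        # Similarity of this position to every position in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        # Weighted average of all value vectors.
        mixed = [sum(w * value[i] for w, value in zip(weights, features))
                 for i in range(d)]
        output.append(mixed)
    return output
```

Unlike an RNN, each output position attends to every position in the sequence directly rather than step by step.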

Installation

$ pip install simpleocr

or

$ git clone https://github.com/xianyuntang/simpleocr
$ cd simpleocr
$ python setup.py install

Usage

from simpleocr import ocr
# get_text takes a list of image file paths
ocr.get_text(['image.jpg'])
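
Since get_text takes a list of paths, a whole folder can probably be processed in one call. The sketch below is only a convenience wrapper under that assumption: collect_image_paths and run_ocr_on_folder are hypothetical helpers (not part of simpleocr), the suffix list is a guess at common image formats, and the return value of get_text is passed through as-is because its format is not documented here.

```python
from pathlib import Path

# Assumption: these are the image formats of interest; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_image_paths(folder):
    """Return sorted paths (as strings) of image files directly in `folder`."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def run_ocr_on_folder(folder):
    """Hypothetical helper: run simpleocr on every image in `folder`."""
    # Imported here so the path helper above works without simpleocr installed.
    from simpleocr import ocr

    # Returned unchanged; the result format is whatever get_text produces.
    return ocr.get_text(collect_image_paths(folder))
```

For example, run_ocr_on_folder('scans') would pass every .jpg, .jpeg and .png file in a folder named scans to get_text.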

TODO

  1. English support
  2. GPU support