PyThaiASR

Python Thai Automatic Speech Recognition

PyThaiASR is a Python package for automatic speech recognition with a focus on the Thai language. It provides an offline Thai automatic speech recognition model.

License: Apache-2.0 License

Google Colab: Link Google colab

Model homepage: https://huggingface.co/airesearch/wav2vec2-large-xlsr-53-th

Install

pip install pythaiasr

For Wav2Vec2 with a language model: if you want to use a wannaphong/wav2vec2-large-xlsr-53-th-cv8-* model with a language model, install the extra dependencies with the following steps.

pip install pythaiasr[lm]
pip install https://github.com/kpu/kenlm/archive/refs/heads/master.zip

Usage

from pythaiasr import asr

file = "a.wav"
print(asr(file))
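
The single-file call above extends naturally to several files. The following is a minimal sketch, not part of the package: transcribe_many and transcribe_directory are hypothetical helper names, and the only pythaiasr call used is asr(path) as shown above. The transcriber is passed in as an argument so the loop can be tried with any function that maps a path to text.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Union

PathLike = Union[str, Path]


def transcribe_many(
    paths: Iterable[PathLike], transcribe: Callable[[str], str]
) -> Dict[str, str]:
    """Run `transcribe` on each path and collect the text keyed by path."""
    results: Dict[str, str] = {}
    for path in paths:
        # Convert Path objects to plain strings before calling the transcriber.
        results[str(path)] = transcribe(str(path))
    return results


def transcribe_directory(directory: PathLike) -> Dict[str, str]:
    """Transcribe every .wav file directly inside `directory` (hypothetical helper)."""
    # Imported lazily so transcribe_many stays usable without pythaiasr installed.
    from pythaiasr import asr

    wav_files = sorted(Path(directory).glob("*.wav"))
    return transcribe_many(wav_files, asr)
```

Calling transcribe_directory("recordings") would return a dictionary that maps each file path to its transcription; "recordings" is only a placeholder directory name.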

API

asr(data: str, model: str = _model_name, lm: bool=False, device: str=None, sampling_rate: int=16_000)

data: path to a sound file, or a NumPy array of the audio
model: the ASR model to use
lm: use a language model (not available for the airesearch/wav2vec2-large-xlsr-53-th model)
device: the device to run the model on
sampling_rate: the sample rate of the audio
return: Thai text from ASR

Options for model

airesearch/wav2vec2-large-xlsr-53-th (default) - AI RESEARCH - PyThaiNLP model
wannaphong/wav2vec2-large-xlsr-53-th-cv8-newmm - Thai Wav2Vec2 with CommonVoice V8 (newmm tokenizer)
wannaphong/wav2vec2-large-xlsr-53-th-cv8-deepcut - Thai Wav2Vec2 with CommonVoice V8 (deepcut tokenizer)
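
The model and lm parameters interact: the language model is documented as unavailable for the default airesearch model. The sketch below shows one way to check that before calling asr. It assumes only the asr(data, model=..., lm=...) signature listed in the API section; asr_options and transcribe are hypothetical helper names, not part of the package.

```python
from typing import Any, Dict

# Model names taken from the options list above.
DEFAULT_MODEL = "airesearch/wav2vec2-large-xlsr-53-th"
NEWMM_MODEL = "wannaphong/wav2vec2-large-xlsr-53-th-cv8-newmm"
DEEPCUT_MODEL = "wannaphong/wav2vec2-large-xlsr-53-th-cv8-deepcut"


def asr_options(model: str = DEFAULT_MODEL, use_lm: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for asr(), rejecting lm with the default model."""
    if use_lm and model == DEFAULT_MODEL:
        raise ValueError(
            "lm=True is not supported for " + DEFAULT_MODEL
            + "; choose one of the wannaphong/...-cv8-* models"
        )
    return {"model": model, "lm": use_lm}


def transcribe(path: str, model: str = DEFAULT_MODEL, use_lm: bool = False) -> str:
    """Hypothetical wrapper: validate the options, then call pythaiasr.asr."""
    # Imported lazily so asr_options can be used without pythaiasr installed.
    from pythaiasr import asr

    return asr(path, **asr_options(model, use_lm))
```

For example, transcribe("a.wav", model=NEWMM_MODEL, use_lm=True) would request the newmm model with its language model, which also requires the pythaiasr[lm] and kenlm install steps described under Install.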

You can read more about each model from the list above.

Docker

To use this inside of Docker do the following:

docker build -t <Your Tag name> .
docker run --entrypoint /bin/bash -it <Your Tag name>

You will then get access to an interactive shell environment where you can use Python with all packages installed.
