A Python library for parsing SAMI files.
This project was originally theeluwin/PySAMI, itself a fork of g6123/PySAMI, so the license is preserved as @g6123's.
pip install samitizer
To use the automatic charset detection feature, you also need to install uchardet.
sudo apt-get install uchardet
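Because automatic detection relies on the external uchardet program, it can help to check for it before asking for auto-detection. The following is a minimal sketch, not part of samitizer: `pick_encoding` is a hypothetical helper, and it assumes the program is reachable on `PATH` under the name `uchardet`.

```python
import shutil


def pick_encoding(fallback="cp949", which=shutil.which):
    """Return None (auto-detect) if `uchardet` is on PATH, else `fallback`.

    Hypothetical helper, not part of samitizer. The `which` parameter is
    injectable only so the check can be exercised without the binary.
    """
    if which("uchardet") is not None:
        # uchardet looks available, so auto-detection can be requested.
        return None
    # No uchardet found: use an explicit encoding instead.
    return fallback
```

The result can then be passed as the `encoding` argument, for example `Sami('sample.smi', encoding=pick_encoding())`.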
from samitizer import Sami
# Passing `encoding=None` invokes `uchardet` through a subprocess call.
# Tip: try `encoding='cp949'` if nothing works.
sami = Sami('sample.smi', encoding=None)
# These `subtitles` are instances of the `samitizer.Subtitle` class.
print(sami.subtitles[0].lang2content['KRCC'])
vtt_text = sami.convert('vtt', lang='KRCC')
plain_text = sami.convert('plain', lang='KRCC')
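The converted text usually ends up in a file. Here is a rough sketch of one way to do that, assuming `convert` returns a `str` as the variable names above suggest. `export_subtitles` and the `EXTENSIONS` mapping are hypothetical and not part of samitizer.

```python
from pathlib import Path

# Assumed mapping from format names to file extensions (not defined by samitizer).
EXTENSIONS = {"vtt": ".vtt", "plain": ".txt"}


def export_subtitles(sami, stem, lang, out_dir=".", formats=("vtt", "plain")):
    """Write one file per format and return the list of written paths.

    `sami` is any object with a `convert(fmt, lang=...)` method that
    returns text, such as a `samitizer.Sami` instance.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for fmt in formats:
        text = sami.convert(fmt, lang=lang)
        path = out_dir / (stem + EXTENSIONS[fmt])
        # Write as UTF-8 regardless of the encoding of the source file.
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

With the `sami` object from above, `export_subtitles(sami, 'sample', 'KRCC')` would write `sample.vtt` and `sample.txt` into the current directory.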
Testing requires some additional packages (flake8 is optional).
pip install nose nose-exclude flake8 coverage
You can run the tests with nose:
nosetests --config=.noserc
or with Docker:
docker build -t samitizer -f Dockerfile .
docker run samitizer