CoEDL Kaldi Helpers
A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Read about setting up Docker to run all this.
For more information about data requirements, see the data guide.
Requirements
This pipeline relies on Python 3.6 and several open-source Python packages (listed here). It also assumes that Kaldi, sox and task are already installed. We highly recommend using our Docker image.
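If you are not using the Docker image, a quick pre-flight check can confirm the requirements above before you run anything. The following is a minimal sketch, not part of Kaldi Helpers: the function name `missing_requirements` is hypothetical, it assumes that `sox` and `task` are expected on your `PATH`, and it treats Python 3.6 as a minimum version, which is also an assumption. Kaldi is not checked here because it does not ship as a single executable.

```python
import shutil
import sys

# Assumptions for illustration: 3.6 is treated as a minimum version,
# and both command-line tools are expected to be on PATH.
REQUIRED_PYTHON = (3, 6)
REQUIRED_TOOLS = ("sox", "task")


def missing_requirements(tools=REQUIRED_TOOLS, min_python=REQUIRED_PYTHON):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # Compare only (major, minor) against the required version.
    if sys.version_info[:2] < min_python:
        problems.append("Python {}.{} or newer is required".format(*min_python))
    # shutil.which returns None when the executable cannot be found on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append("'{}' was not found on PATH".format(tool))
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

Running the script prints one line per missing requirement and prints nothing when all checks pass.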
Tasks
This library uses the task tool to run the more complex processes automatically. Once you've set up Kaldi Helpers, you can run the various pipeline tasks we've developed; in the Docker image they work out of the box. You can read about these tasks here.