Speech Emotion Recognition models and training using PyTorch


Keywords
speech-emotion-recognition, speech-processing, tensorflow
License
Apache-2.0
Install
pip install vistec-ser==0.4.0a3

Documentation

Vistec-AIS Speech Emotion Recognition

Speech Emotion Recognition model training and inference using TensorFlow 2.x

Installation

From Pypi

pip install vistec-ser

From source

git clone https://github.com/tann9949/vistec-ser.git
cd vistec-ser
python setup.py install

Usage

Train with Your Own Data

We provide a Google Colaboratory example for training on the Emo-DB dataset using our repository.

Preparing Data

To train with your own data, you need to prepare two files:

  1. config.yml (see an example in tests/config.yml) - This file contains the configuration for feature extraction and feature augmentation.
  2. labels.csv - A .csv file with two columns mapping each audio path to its emotion (see the sketch after this list).
    • Your .csv file should contain a header row, since we skip the first line when reading.
    • Currently, we only support 5 emotions (neutral, anger, happiness, sadness, and frustration). If you want to add more, modify the EMOTIONS variable in dataloader.py.
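
As a rough sketch, you could generate a labels.csv like the one below. The file name, the column headers, and the audio paths here are placeholders for illustration, not names prescribed by the repository.

import csv

# Hypothetical audio paths and labels for illustration only;
# replace them with entries pointing to your own recordings.
rows = [
    ("data/clip_001.wav", "neutral"),
    ("data/clip_002.wav", "anger"),
    ("data/clip_003.wav", "happiness"),
]

with open("labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["path", "emotion"])  # header row, skipped when the file is read
    writer.writerows(rows)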

Preparing a model

Now, prepare your model. You can implement your own model using tf.keras.Sequential or use one of the models provided in models.py.
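
A minimal sketch of a custom tf.keras.Sequential model, assuming 40-dimensional input features and the 5 supported emotions; the layer choices and sizes are illustrative, not the architecture shipped in models.py.

import tensorflow as tf

NUM_EMOTIONS = 5   # neutral, anger, happiness, sadness, frustration
NUM_FEATURES = 40  # assumed feature dimension; match it to your config.yml

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, NUM_FEATURES)),  # variable-length feature sequences
    tf.keras.layers.Conv1D(64, kernel_size=5, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_EMOTIONS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])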

Training

To train a model, create a DataLoader object and call its .get_dataset method to obtain the tf.data.Dataset used for training. DataLoader also uses a FeatureLoader, which reads config.yml. Each batch of the dataset is automatically padded to the longest sequence length in that batch.
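
A minimal sketch of the training step described above. Only the DataLoader, FeatureLoader, and .get_dataset names come from this README; the import paths and constructor arguments are assumptions, so check the repository's dataloader.py for the real signatures.

# NOTE: module paths and constructor arguments are assumed for illustration;
# consult the repository source for the actual API.
from vistec_ser.datasets.dataloader import DataLoader
from vistec_ser.datasets.feature_loader import FeatureLoader

feature_loader = FeatureLoader("config.yml")            # reads feature extraction/augmentation config
data_loader = DataLoader("labels.csv", feature_loader)  # maps audio paths to their emotion labels
train_ds = data_loader.get_dataset(batch_size=32)       # tf.data.Dataset, padded per batch

model.fit(train_ds, epochs=10)  # `model` is the compiled Keras model from the previous section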

Inferencing

TODO

Reference

This repository structure was inspired by TensorFlowASR by Huy Le Nguyen (@usimarit). Please check it out!

Author & Sponsor

VISTEC-depa Thailand Artificial Intelligence Research Institute

Chompakorn Chaksangchaichot

Email: chompakornc_pro@vistec.ac.th