hotword_detection

Python Hotword Detection


License
Other
Install
pip install hotword_detection==1.2

Documentation

HotWord Detection

This library provides functionality for detecting a hotword in given audio file using MFCC features and Dynamic Time Warping (DTW) pattern matching algorithm.

Installation

This project is on pypi

To install from pypi:

pip install hotword_detection

From this repository:

git clone https://github.com/sakethgsharma/HotWordDetection.git
python setup.py install

Usage

Example scripts

For training a hotword, run:

python bin/trainHotword.py

For testing, run:

python bin/checkHotword.py

Supported features

  • Mel Frequency Cepstral Coefficients
  • Choice of selecting any suitable hot word through appropriate training paradigm
  • Supports variable sampling frequencies
  • Amplitude based Voice Activity Detector(VAD) used during recordings to remove extraneous noise
  • Personalization using automatic DTW thresholding

MFCC Features

MFCC vectors are used in this module since they are the most commonly extracted features used for speech recognition systems.

Parameter Description
alpha Parameter used in pre-emphasis filtering. Should be any value between 0 and 1.
N Number of FFT points.
fs Sampling frequency of stored audio file.
frame_dur Duration of 1 speech frame.
num_filters Number of filters used in the Mel filterbank.
lower_freq Lower frequency bound used for constructing filterbank.
upper_freq Upper frequency bound used for constructing filterbank. Should be less than fs/2.

Dynamic Time Warping

Dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences which may vary in speed.

Reference

MFCC tutorial DTW Wiki