PyRTSTools

Introduction

Python Real Time Speech Tools (pyrtstools) is a collection of classes for building real-time speech processing pipelines for voice user interfaces.

Disclaimer: This is an early version designed to provide a voice command detection pipeline for LinTO. However, the elements are designed to be generic and can be used for other purposes.

Features

pyrtstools provides the following blocks:

Audio acquisition
Voice activity detection
Feature extraction
Keyword spotting

All elements are designed to be easy to use and easy to interconnect.

Installation

To install the package you need python3 and pip/setuptools installed.

Required libraries are:

portaudio19-dev (for pyaudio microphone input)

The Python dependencies are installed automatically. Note that this may take some time, as some of them (numpy, tensorflow) are fairly large.

pypi

sudo pip3 install pyrtstools

From source

git clone https://github.com/linto-ai/pyrtstools.git
cd pyrtstools
sudo ./setup.py install

Note for installation on ARM

pyrtstools requires tensorflow>=2.0.0; however, wheels for ARM stop at 1.14 on piwheels and PyPI. You must install tensorflow 2.0.0 from a precompiled wheel before installing pyrtstools. A wheel built for Python 3.7 on armv7l can be downloaded and installed as follows:
wget https://github.com/lhelontra/tensorflow-on-arm/releases/download/v2.0.0/tensorflow-2.0.0-cp37-none-linux_armv7l.whl
pip install tensorflow-2.0.0-cp37-none-linux_armv7l.whl
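
Before installing pyrtstools on ARM, it can be useful to check whether an existing TensorFlow install already satisfies the tensorflow>=2.0.0 requirement. The following is a minimal sketch; `meets_minimum` is a hypothetical helper that is not part of pyrtstools, and it compares only the leading numeric components of the version string.

```python
import platform
import re


def meets_minimum(version, minimum=(2, 0, 0)):
    """Return True if a version string such as '2.0.0' is >= minimum.

    Hypothetical helper (not part of pyrtstools). Only the first three
    numeric components are compared; suffixes such as 'rc0' are ignored.
    """
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    parts += [0] * (3 - len(parts))  # pad '2.1' to (2, 1, 0)
    return tuple(parts) >= tuple(minimum)


# platform.machine() reports the architecture of the current machine.
print("Machine:", platform.machine())
print("1.14.0 is enough:", meets_minimum("1.14.0"))
print("2.0.0 is enough:", meets_minimum("2.0.0"))
```

Once TensorFlow is installed, the same check can be applied to the installed version with `meets_minimum(tf.__version__)` after `import tensorflow as tf`.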

Usage

Here is a simple pipeline designed to detect a hotword from the microphone.

import pyrtstools as rts

def on_detect(i, v):
    print("Detected keyword {} with confidence {}".format(i, v))

audioParam = rts.listenner.AudioParams() # Holds the audio signal parameters
listenner = rts.listenner.Listenner(audioParam) # Microphone input
btn = rts.transform.ByteToNum(normalize=True) # Converts the raw signal to numerical values
featParams = rts.features.MFCCParams() # Holds the MFCC feature parameters
mfcc = rts.features.SonopyMFCC(featParams) # Extracts MFCC features
kws = rts.kws.KWS("/path/to/your-model") # Hotword spotting
kws.on_detection = on_detect # Callback invoked on keyword detection
pipeline = rts.Pipeline([listenner, btn, mfcc, kws]) # Holds the elements and links them
pipeline.start() # Starts all the elements
try:
    listenner.join() # Wait for the microphone to finish (To block the execution)
except KeyboardInterrupt:
    pipeline.close()
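
To illustrate the idea of elements that are linked in order and end in a detection callback, here is a minimal self-contained sketch. It does not use pyrtstools and makes no claim about how rts.Pipeline is implemented; `Stage`, `chain`, the toy stages, and the threshold of 1000 are all hypothetical names and values chosen for the example.

```python
class Stage:
    """Hypothetical pipeline stage, for illustration only (not the pyrtstools API)."""

    def __init__(self, func):
        self.func = func   # transformation applied to each incoming chunk
        self.next = None   # downstream stage, set by chain()

    def push(self, data):
        out = self.func(data)
        # A stage returns None when it has nothing to forward.
        if out is not None and self.next is not None:
            self.next.push(out)


def chain(stages):
    """Link each stage to the one after it and return the first stage."""
    for upstream, downstream in zip(stages, stages[1:]):
        upstream.next = downstream
    return stages[0]


detections = []


def on_detect(index, value):
    """Callback with the same (index, value) shape as in the usage example."""
    detections.append((index, value))


def detect(level):
    # Toy detector: fire the callback when the level exceeds an arbitrary threshold.
    if level > 1000:
        on_detect(0, level)
    return None


rectify = Stage(lambda samples: [abs(s) for s in samples])
mean_level = Stage(lambda samples: sum(samples) / len(samples))
detector = Stage(detect)

head = chain([rectify, mean_level, detector])
head.push([100, -200, 50])           # quiet chunk: no detection
head.push([30000, -31000, 29000])    # loud chunk: triggers the callback
print(detections)
```

Each stage only knows its immediate successor, so stages can be reordered or swapped without touching the others, which is the property the block list in the Features section relies on.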

Every block is located in a subpackage:

Audio acquisition: pyrtstools.listenner
Voice activity detection: pyrtstools.vad
Feature extraction: pyrtstools.features
Keyword spotting: pyrtstools.kws
Signal transformation: pyrtstools.transform
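
As an illustration of the kind of work a signal transformation step does, the sketch below decodes raw audio bytes into numbers, optionally normalized to the range [-1, 1). It is an assumption-laden example, not the ByteToNum implementation: it assumes 16-bit little-endian signed PCM and normalization by 32768, and `bytes_to_numbers` is a hypothetical name.

```python
import struct


def bytes_to_numbers(raw, normalize=True):
    """Decode 16-bit little-endian signed PCM bytes into a list of numbers.

    Hypothetical helper; the sample format and the scaling factor are
    assumptions and may differ from what pyrtstools.transform does.
    """
    count = len(raw) // 2  # two bytes per sample; a trailing odd byte is dropped
    samples = struct.unpack("<{}h".format(count), raw[:count * 2])
    if normalize:
        return [s / 32768.0 for s in samples]
    return list(samples)


raw = struct.pack("<3h", 0, 16384, -32768)
print(bytes_to_numbers(raw))                   # [0.0, 0.5, -1.0]
print(bytes_to_numbers(raw, normalize=False))  # [0, 16384, -32768]
```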

Every element and class is documented.

Licence

This project is licensed under the AGPLv3. Feel free to use and modify the code under those terms. See LICENCE.

pyrtstools
Release 0.2.9
