Tools for real time speech processing, keyword spotting

pip install pyrtstools==0.2.9



Python Real Time Speech Tools (pyrtstools) is a collection of classes for building a real-time speech processing pipeline for voice user interfaces.

Disclaimer: This is an early version designed to provide a voice command detection pipeline for LinTO. However, the elements are designed to be generic and can be used for other purposes.


pyrtstools features different blocks:

  • Audio acquisition
  • Voice activity detection
  • Feature extraction
  • Keyword spotting

All the elements are designed to be easy to use and easy to interconnect.
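To illustrate what interconnecting blocks means, here is a minimal, self-contained sketch of a chain in which each block transforms a chunk and forwards the result to the next one. The names `Element`, `feed` and `connect` are hypothetical and are not part of the pyrtstools API; the toy lambdas only stand in for the real acquisition, VAD, feature and keyword spotting blocks.

```python
class Element:
    """Hypothetical processing block: transforms a chunk and forwards it."""

    def __init__(self, func):
        self.func = func
        self.next_element = None

    def feed(self, chunk):
        result = self.func(chunk)
        if result is None:
            return None  # nothing to forward (e.g. silence dropped by a VAD)
        if self.next_element is None:
            return result  # last block: hand the result back to the caller
        return self.next_element.feed(result)


def connect(elements):
    """Link each element to the one after it and return the first one."""
    for current, following in zip(elements, elements[1:]):
        current.next_element = following
    return elements[0]


# Toy stand-ins for acquisition -> VAD -> features -> keyword spotting
to_samples = Element(lambda raw: list(raw))
drop_silence = Element(lambda s: s if max(s) > 10 else None)
mean_feature = Element(lambda s: sum(s) / len(s))
spot = Element(lambda f: "keyword" if f > 50 else "background")

head = connect([to_samples, drop_silence, mean_feature, spot])
print(head.feed(bytes([100, 120, 90])))  # loud chunk reaches the last block
print(head.feed(bytes([1, 2, 3])))       # quiet chunk is dropped by the VAD stand-in
```

Because every block exposes the same `feed` entry point in this sketch, blocks can be reordered or swapped without touching their neighbours.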


To install the package, you need Python 3 with pip and setuptools installed.

Required libraries are:

  • portaudio19-dev (For pyaudio microphone input)

The Python dependencies are installed automatically. (Note that this may take some time, as some of them, such as numpy and tensorflow, are fairly large.)


sudo pip3 install pyrtstools

From source

git clone
cd pyrtstools
sudo ./ install

Note for installation on ARM

pyrtstools requires tensorflow>=2.0.0; however, wheels for ARM stop at 1.14 on piwheels and PyPI. You must install tensorflow 2.0.0 from a compiled wheel before installing pyrtstools. The .whl file can be found here

pip install tensorflow-2.0.0-cp37-none-linux_armv7l.whl


Here is a simple pipeline designed to detect a hotword from the microphone.

import pyrtstools as rts

def on_detect(i, v):
    print("Detected keyword {} with confidence {}".format(i, v))

audioParam = rts.listenner.AudioParams() # Hold signal parameters
listenner = rts.listenner.Listenner(audioParam) # Microphone input
btn = rts.transform.ByteToNum(normalize=True) #Convert raw signal to numerical
featParams = rts.features.MFCCParams() # Hold MFCC features parameters
mfcc = rts.features.SonopyMFCC(featParams) # Extract MFCC
kws = rts.kws.KWS("/path/to/your-model") # Hotword spotting
kws.on_detection = on_detect # On keyword detection. 
pipeline = rts.Pipeline([listenner, btn, mfcc, kws]) # Holds elements and links them
pipeline.start() # Start all the elements
try:
    listenner.join() # Wait for the microphone to finish (To block the execution)
except KeyboardInterrupt:
    pass # Ctrl+C ends the wait; release the pipeline's resources here

Every block is located in a subpackage:

  • Audio acquisition: pyrtstools.listenner
  • Voice activity detection: pyrtstools.vad
  • Features extraction: pyrtstools.features
  • Keyword spotting: pyrtstools.kws
  • Signal transformation: pyrtstools.transform
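As an illustration of what a signal transformation block does, the sketch below converts raw microphone bytes into numbers, with optional normalization. It assumes 16-bit signed little-endian PCM input and uses only the standard library; `bytes_to_floats` is a hypothetical helper written for this example and is not the pyrtstools implementation.

```python
import array
import sys


def bytes_to_floats(raw, normalize=True):
    """Decode 16-bit signed little-endian PCM bytes into a list of numbers."""
    samples = array.array("h")  # 'h' = signed 16-bit integers
    samples.frombytes(raw)      # raises ValueError on an odd byte count
    if sys.byteorder == "big":
        samples.byteswap()      # the input is assumed to be little-endian
    if normalize:
        return [s / 32768.0 for s in samples]  # scale to [-1.0, 1.0)
    return list(samples)


raw = b"\x00\x00\x00\x40\x00\x80"  # samples 0, 16384 and -32768
print(bytes_to_floats(raw))                   # [0.0, 0.5, -1.0]
print(bytes_to_floats(raw, normalize=False))  # [0, 16384, -32768]
```

Normalizing to a fixed range keeps downstream feature extraction independent of the sample width used by the audio device.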

Every element and class is documented.


This project is under the AGPLv3 licence. Feel free to use and modify the code under those terms. See LICENCE.

