Language data store and linguistic query API


Keywords
phonology, corpus, phonetics, acoustics, database, influxdb, neo4j, rest-api, speech-analysis, speech-processing
License
MIT
Install
pip install polyglotdb==0.1.13

PolyglotDB

PolyglotDB is a Python package for storing and querying large speech corpora. It builds several kinds of databases and provides a consistent Python API for interacting with them. The online documentation is available at http://polyglotdb.readthedocs.io/en/latest/.

This package is intended for developers and those experienced with scripting in Python. If you would like to use a graphical interface for querying and interacting with PolyglotDB databases, please see Speech Corpus Tools (http://speech-corpus-tools.readthedocs.io/en/latest/). Speech Corpus Tools is currently deprecated and is undergoing a significant update to match recent development of PolyglotDB.

Citation

McAuliffe, Michael, Elias Stengel-Eskin, Michaela Socolof, Arlie Coles, Sarah Mihuc, and Morgan Sonderegger (2017). PolyglotDB [Computer program]. Version 0.0.1 (alpha), retrieved 28 July 2017 from https://github.com/MontrealCorpusTools/PolyglotDB.

or

McAuliffe, Michael, Elias Stengel-Eskin, Michaela Socolof, and Morgan Sonderegger (2017). Polyglot and Speech Corpus Tools: a system for representing, integrating, and querying speech corpora. In Proceedings of Interspeech 2017.