Python library for word stress detection

pip install russ

git clone https://github.com/IlyaGusev/russ
cd russ
pip install -r requirements.txt
python setup.py install

Colab: link

from russ.stress.predictor import StressPredictor

model = StressPredictor()
model.predict("корова")

>>> [3]
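
The output above is a list of stress positions. Judging only from this example, the values look like 0-based character offsets ("корова"[3] is the second "о"); that reading is an assumption, not something the library documents here. Under that assumption, a small helper (mark_stress is a hypothetical name, not part of russ) could render the result as an accented word:

```python
ACUTE = "\u0301"  # Unicode combining acute accent


def mark_stress(word, positions):
    """Insert an acute accent after each stressed character.

    Assumes `positions` holds 0-based character offsets, as the
    "корова" -> [3] example suggests.
    """
    stressed = set(positions)
    marked = []
    for index, char in enumerate(word):
        marked.append(char)
        if index in stressed:
            marked.append(ACUTE)
    return "".join(marked)


# With the library installed, positions would come from the predictor:
#   positions = model.predict("корова")
print(mark_stress("корова", [3]))
```

The helper does not import russ, so it can be tried without downloading the model.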

Script for downloading datasets:

ru_custom.txt: 885 words
zaliznyak.txt: 86839 lexemes
espeak.txt: 804909 words
ruwiktionary-20221201-pages-articles.xml: articles from ruwiktionary (change the date in the file name to use a newer dump)

Preparing data for training

Argument	Default	Description
--wiktionary-dump-path	None	path to downloaded wiktionary dump
--espeak-dump-path	None	path to espeak dump
--custom-dict-path	None	path to file with custom words
--inflected-dict-path	None	path to downloaded file with lexemes
--inflected-sample-rate	0.3	part of inflected dict to use
--split-mode	lexemes	how to split into train, val and test files: "sort", "lexemes" or "shuffle"
--all-path	data/all.txt	path to output file with all words
--train-path	data/train.txt	path to output train file
--val-path	data/val.txt	path to output validation file
--test-path	data/test.txt	path to output test file
--val-part	0.05	fraction of the data written to the validation file
--test-part	0.05	fraction of the data written to the test file
--lower	False	lowercase all words
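
To make the --val-part and --test-part defaults concrete, here is a minimal sketch of what a "shuffle" split could look like. It is an illustration under stated assumptions, not the library's actual implementation: split_words and its seed parameter are hypothetical names, and the "sort" and "lexemes" modes are not covered.

```python
import random


def split_words(words, val_part=0.05, test_part=0.05, seed=0):
    """Shuffle words and cut them into train, validation and test lists.

    Hypothetical helper: the fractions mirror the --val-part and
    --test-part defaults from the table above.
    """
    shuffled = list(words)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_val = round(len(shuffled) * val_part)
    n_test = round(len(shuffled) * test_part)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


words = [f"word{i}" for i in range(100)]
train, val, test = split_words(words)
print(len(train), len(val), len(test))
```

A plain shuffle can put different forms of the same lexeme into both train and test, which may be why the table lists "lexemes" as the default split mode.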

russ
Release 0.0.2