fraug

FRAUG (For Realistic AUGmentations) 🐸 library


License: Apache-2.0
Install: pip install fraug==0.0.1

FRAUG

The GitHub repository of the FRAUG (For Realistic AUGmentations) 🐸 library!

🚧 WIP

TODO

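| Methods | Sub-method | Sub-submethod | Interest of the method | Pseudo-code for French | Pseudo-code for multilingual | Rust | Example |
|---|---|---|---|---|---|---|---|
| Lexical substitution | Thesaurus | Dictionary of synonyms | 🟢 | 🔴 | 🔴 | 🔴 | |
| | | WordNet | 🟢 | 🔴 | 🔴 | 🔴 | |
| | | Wonef | 🟢 | 🔴 | 🔴 | 🔴 | |
| | Word embedding | Gensim (Fauconnier) | 🟢 | 🔴 | 🔴 | 🔴 | |
| | | FastText | 🟢 | 🔴 | 🔴 | 🔴 | |
| | Masked language model (BERT-like) | Random | 🟢 | 🔴 | 🔴 | 🔴 | |
| | | POS | 🔴 | 🔴 | 🔴 | 🔴 | |
| | TF-IDF | | 🔴 | 🔴 | 🔴 | 🔴 | |
| Back-translation | Marian (Helsinki-NLP models) | | 🟢 | 🔴 | 🔴 | 🔴 | |
| | M2M100 | | 🟢 | 🔴 | 🔴 | 🔴 | |
| | See if other models have appeared since | | 🔴 | 🔴 | 🔴 | 🔴 | |
| Transformation of the text surface | | | Not relevant in French, will have to be done for English | 🔴 | 🔴 | 🔴 | |
| Random noise injection | Spelling mistakes injection | | 🟢 | 🔴 | 🔴 | 🔴 | |
| | Typing errors injection | | 🟢 | 🔴 | 🔴 | 🔴 | |
| | Unigram noise injection | | 🔴 | 🔴 | 🔴 | 🔴 | |
| | Noise injection | | 🔴 | 🔴 | 🔴 | 🔴 | |
| | Mixed sentences | | 🔴 | 🔴 | 🔴 | 🔴 | |
| | Random insertion | | 🟢 | 🔴 | 🔴 | 🔴 | |
| | Random swap | | 🟢 | 🔴 | 🔴 | 🔴 | |
| | Random deletion | | 🟢 | 🔴 | 🔴 | 🔴 | |
| Cross-over augmentation | | | 🔴 | 🔴 | 🔴 | 🔴 | |
| Manipulating the syntax tree | Time manipulation | | 🟠 | 🔴 | 🔴 | 🔴 | |
| | Gender manipulation | | 🟠 | 🔴 | 🔴 | 🔴 | |
| | Number manipulation | | 🟠 | 🔴 | 🔴 | 🔴 | |
| MixUp | Word Mix Up | | 🔴 | 🔴 | 🔴 | 🔴 | |
| | Sentence Mix Up | | 🔴 | 🔴 | 🔴 | 🔴 | |
| Generative methods | Generate paraphrases | | 🟠 | 🔴 | 🔴 | 🔴 | |
| | Complexification | | 🟠 | 🔴 | 🔴 | 🔴 | |
| Text simplification | Text summary | | 🟢 | 🔴 | 🔴 | 🔴 | |
| | Simplification | | 🟠 | 🔴 | 🔴 | 🔴 | |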
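
To make two of the rows above concrete, here is a minimal Python sketch of random swap and random deletion on a whitespace-tokenized sentence. It is an illustration only and not the FRAUG API: the function names `random_swap` and `random_deletion`, their parameters (`n_swaps`, `p`, `rng`), and the whitespace tokenization are all assumptions made for this example.

```python
import random


def random_swap(tokens, n_swaps=1, rng=None):
    """Return a copy of `tokens` with `n_swaps` random pairs of positions exchanged.

    Hypothetical helper for illustration; not part of FRAUG.
    """
    rng = rng or random.Random()
    out = list(tokens)
    if len(out) < 2:
        # Nothing to swap in an empty or single-token sentence.
        return out
    for _ in range(n_swaps):
        # Pick two distinct positions and exchange their tokens.
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p=0.1, rng=None):
    """Drop each token independently with probability `p`, keeping at least one.

    Hypothetical helper for illustration; not part of FRAUG.
    """
    rng = rng or random.Random()
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Never return an empty sentence: fall back to one random token.
    return kept or [rng.choice(list(tokens))]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are reproducible
    sentence = "le chat dort sur le canapé".split()
    print(" ".join(random_swap(sentence, n_swaps=1, rng=rng)))
    print(" ".join(random_deletion(sentence, p=0.2, rng=rng)))
```

Passing an explicit `random.Random` instance keeps the augmentations reproducible, which helps when comparing augmented datasets across runs.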

If you find the project useful, please consider giving it a star ⭐️.