Avesta

Tools for text and language processing

Avesta is a Python library for text and language (pre)processing.

Installation

pip install avesta

Usage

Similarity

Lexical Similarity

from avesta.tools.similarity.lexical_sim import lexical_synonym_checker

status = lexical_synonym_checker("پیراهن مردانه سایز 12 قرمز", "پیراهن قرمز       مردانه سایز ۱۲")
print(status)
# Yes (the two strings are lexical synonyms: same words, different order, spacing, and digit script.)
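
The example above pairs two strings that differ only in word order, extra spaces, and digit script (12 vs. ۱۲). As a rough illustration of that idea, and not the library's actual implementation, the comparison could be sketched as follows. The helpers `normalize_tokens` and `same_tokens` are hypothetical names introduced here; the real checker may apply other or additional rules.

```python
# Assumption: lexical equivalence is approximated by comparing normalized tokens.
# Map Persian (U+06F0..U+06F9) and Arabic-Indic (U+0660..U+0669) digits to ASCII.
_DIGITS = {0x06F0 + i: str(i) for i in range(10)}
_DIGITS.update({0x0660 + i: str(i) for i in range(10)})


def normalize_tokens(text):
    """Hypothetical helper: unify digits, split on whitespace runs, sort tokens."""
    return sorted(text.translate(_DIGITS).split())


def same_tokens(a, b):
    """True when both strings contain the same tokens, ignoring order and spacing."""
    return normalize_tokens(a) == normalize_tokens(b)


print(same_tokens("پیراهن مردانه سایز 12 قرمز", "پیراهن قرمز       مردانه سایز ۱۲"))
# True
```

Sorting the tokens makes the comparison order-insensitive while still respecting repeated words.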

Semantic Similarity

from avesta.tools.similarity.semantic_sim import semantic_synonym_checker

status = semantic_synonym_checker("پیراهن مردانه مشکی", "پیراهن سیاه مردانه")
print(status)
# Yes (the two strings are semantic synonyms: both mean "black men's shirt".)

Character-based Similarity

# Character-based similarity compares the distance between two strings against a threshold.

from avesta.tools.similarity.cbs import similarity

sim = similarity()
status = sim.char_based_similarity("avesta", "a vesta", threshold=1)
print(status)
# True (the distance is less than or equal to the threshold.)
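
The README does not say which distance `char_based_similarity` uses. Assuming it is something like Levenshtein edit distance (an assumption, not confirmed by the source), the thresholded check could be sketched like this. `edit_distance` and `within_threshold` are hypothetical names used only for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def within_threshold(a, b, threshold):
    """Assumed semantics: True when the distance does not exceed the threshold."""
    return edit_distance(a, b) <= threshold


print(edit_distance("avesta", "a vesta"))                  # 1
print(within_threshold("avesta", "a vesta", threshold=1))  # True
```

Under this assumption, "avesta" and "a vesta" differ by a single inserted space, which matches the `True` result shown above for `threshold=1`.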

Handle whitespace mistakes as part of spell checking

# correct_spacing restores a missing space between words that were typed together.

from avesta.tools.spell_checker.whitespace_handler import correct_spacing

print(correct_spacing("مانتوزنانه"))
# 'مانتو زنانه'
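
How `correct_spacing` finds the word boundary is not described here. One common approach, shown purely as an assumption, is dictionary-based segmentation: try to split the string into words from a known vocabulary. The function `split_with_vocab` and the two-word toy vocabulary below are hypothetical and are not part of the avesta API.

```python
def split_with_vocab(text, vocab):
    """Split text into vocabulary words; return None if no full split exists."""
    n = len(text)
    # best[i] is a list of words covering text[:i], or None if unreachable.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and text[j:i] in vocab:
                best[i] = best[j] + [text[j:i]]
                break
    return best[n]


# Toy vocabulary (assumption): a real spell checker would use a large lexicon.
vocab = {"مانتو", "زنانه"}
words = split_with_vocab("مانتوزنانه", vocab)
print(" ".join(words))
# مانتو زنانه
```

A production implementation would also need to rank competing splits, for example by word frequency, which this sketch leaves out.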

Stats

Latest release: Sep 23, 2024
First release: Aug 23, 2023
Repository size: 11.6 MB

avesta, Release 0.10

Available releases: 0.19, 0.16, 0.15, 0.14, 0.13, 0.12, 0.11, 0.10, 0.9, 0.8
