text_comparer

Metric for comparing text


License
BSD-3-Clause
Install
pip install text_comparer==0.0.2

Documentation

Text Comparer

Uses cosine similarity to score how similar two texts are, on a scale from 0 (least similar) to 1 (most similar).

A companion blog post explains the approach: http://engineering.aweber.com/cosine-similarity/
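
For readers new to the metric, here is a minimal sketch of cosine similarity over plain word counts. It is not this package's implementation: the names bag_of_words and cosine_similarity are made up for illustration, and the tokenization (lowercase, letters and apostrophes only, no stop-word removal or stemming) is an assumption, so its scores will generally differ from those compare_texts returns.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count each word (hypothetical tokenizer)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two word-count vectors."""
    counts_a, counts_b = bag_of_words(text_a), bag_of_words(text_b)
    # Dot product: only words present in both texts contribute.
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text with no words is treated as sharing nothing
    return dot / (norm_a * norm_b)
```

Because word counts are never negative, the result always falls between 0 and 1. Texts with no words in common score 0, and texts with the same word proportions score 1 regardless of length.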

Sample Usage

In [1]: from vectorizer import compare_texts

In [2]: compare_texts('Mary had a little shotgun.', 'Mary loves her shotgun')
Out[2]: 0.66666666666666663

In [3]: compare_texts('John loves Mary.', 'But Mary has a shotgun.')
Out[3]: 0.33333333333333331

The higher score in Out[2] implies that the first pair of sentences is more similar than the second pair. A classic tale of the love-linked-list.