jaro-winkler

Original, standard and customisable versions of the Jaro-Winkler functions.


License
GPL-3.0-only
Install
pip install jaro-winkler==2.0.0

Documentation

JaroWinkler

PyPI version

Original, standard and customisable versions of the Jaro-Winkler functions.

>>> import jaro
>>> jaro.jaro_winkler_metric(u'SHACKLEFORD', u'SHACKELFORD')
0.9818181
>>> help(jaro)

Help on package jaro:

NAME
    jaro - Python translation of the original Jaro-Winkler functions.

DESCRIPTION
    The Jaro-Winkler functions compare two strings and return a score indicating
    how closely the strings match. The score ranges from 0 (no match) to 1
    (perfect match).

    Two null strings ('') will compare as equal. Strings should be unicode
    strings, and will be compared as given; the caller is responsible for
    capitalisations and trimming leading/trailing spaces.

    You should normally only need to use either the jaro_metric() or
    jaro_winkler_metric() functions defined here. If you want to implement your
    own, non-standard metrics, look at the comments and functions in the jaro.py
    submodule.

PACKAGE CONTENTS
   ...
   jaro
   strcmp95
   ...

FUNCTIONS
    jaro_metric(string1, string2)
        The standard, basic Jaro string metric.

    jaro_winkler_metric(string1, string2)
        The Jaro metric adjusted with Winkler's modification, which boosts
        the metric for strings whose prefixes match.

    original_metric(string1, string2)
        The same metric that would be returned from the reference Jaro-Winkler
        C code, taking as it does into account a typo table and adjustments for
        longer strings.
        ...

    custom_metric(string1, string2, typo_table, typo_scale,
                               boost_threshold, pre_len, pre_scale, longer_prob)
        Calculate the Jaro-Winkler metric with parameters of your own choosing.
        ...