Categorical-similarity-measures

Similarity Measures Utility Package


Keywords
Similarity, Distance, Categorical, data
License
MIT
Install
pip install Categorical-similarity-measures==0.4

Documentation

Determining similarity or distance between two objects is a key step for several data mining and knowledge discovery tasks. For quantitative data, Minkowski distance plays a major role in finding the distance between two entities. The prevalently known and used similarity measures are Manhattan distance which is the Minkowski distance of order 1 and the Euclidean distance which is the Minkowski distance of order 2. But, in the case of categorical data, we know that there does not exist an innate order and that makes it problematic to find the distance between two categorical points.

This is a utility package for finding similarity measures such as Eskin, IOF, OF, Overlap (Simple Matching), Goodall1, Goodall2, Goodall3, Goodall4, Lin, Lin1, Morlini_Zani (S2), Variable Entropy and Variable Mutability. These similarity measures help in finding the distance between two or more objects or entities containing categorical data.