multilink

Multifile Record Linkage and Duplicate Detection


License
GPL-3.0

Documentation

multilink is an R package that implements the methodology presented in the manuscript “Multifile Partitioning for Record Linkage and Duplicate Detection” by Serge Aleshin-Guendel and Mauricio Sadinle, published in the Journal of the American Statistical Association and available on arXiv. It addresses the general problem of multifile record linkage and duplicate detection, in which any number of files are to be linked and any of the files may contain duplicates.

Installation

You can install multilink from GitHub using devtools:

install.packages("devtools")
devtools::install_github("aleshing/multilink")
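
To confirm that the installation worked, a minimal check along the following lines may help. It uses only base R functions and assumes nothing about the multilink API beyond the package name:

# Attach multilink only if it is actually installed.
if (requireNamespace("multilink", quietly = TRUE)) {
  library(multilink)
  # Report the installed version.
  print(packageVersion("multilink"))
} else {
  message("multilink is not installed; run the installation commands first.")
}

If the package is missing, the check prints a message instead of stopping with an error, so it is safe to run at the top of a script.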