CLK Hash
clkhash is a Python implementation of cryptographic linkage key (CLK) hashing, as described by Rainer Schnell, Tobias Bachteler, and Jörg Reiher in "A Novel Error-Tolerant Anonymous Linking Code".
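For readers new to the idea, the following is a minimal standard-library sketch of the general technique: each field is split into bigrams, and every bigram sets a few bit positions in a fixed-length Bloom filter using keyed hashes. It is not clkhash's implementation. The function names (`bigrams`, `toy_clk`, `dice`), the filter length, the number of hash positions, and the SHA-1/SHA-256 pair are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a space-padded, lower-cased string into overlapping 2-character tokens."""
    padded = f" {value.lower()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def toy_clk(fields, secret, length=1024, k=20):
    """Return the set of bit positions set in a toy Bloom filter (illustrative only)."""
    key = secret.encode("utf-8")
    bits = set()
    for field in fields:
        for token in bigrams(field):
            data = token.encode("utf-8")
            # Two keyed digests combined by double hashing give k positions per token.
            h1 = int.from_bytes(hmac.new(key, data, hashlib.sha1).digest(), "big")
            h2 = int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), "big")
            for i in range(k):
                bits.add((h1 + i * h2) % length)
    return bits


def dice(a, b):
    """Dice coefficient of two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    john = toy_clk(["john", "smith"], "secret")
    jon = toy_clk(["jon", "smith"], "secret")
    mary = toy_clk(["mary", "jones"], "secret")
    # A misspelled record should score closer to the original than an unrelated one.
    print(dice(john, jon), dice(john, mary))
```

Because similar strings share most of their bigrams, their filters overlap heavily, which is what makes the encoding tolerant of small typing errors. Both parties must use the same secret for their encodings to be comparable.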
Installation
Install clkhash with all dependencies using pip:
pip install clkhash
Documentation
https://clkhash.readthedocs.io
Python API
To hash a CSV file of entities using the default schema:
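from clkhash import clk, randomnames

fake_pii_schema = randomnames.NameList.SCHEMA
# Open the CSV in a context manager so the file is closed after hashing.
with open('fake-pii-out.csv', 'r') as csv_file:
    clks = clk.generate_clk_from_csv(csv_file, 'secret', fake_pii_schema)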
Command Line Interface
See Anonlink Client for a command line interface to clkhash.
Citing
clkhash, and the wider Anonlink project, is designed, developed and supported by CSIRO's Data61. If you use any part of this library in your research, please cite it using the following BibTeX entry::
@misc{Anonlink,
author = {CSIRO's Data61},
title = {Anonlink Private Record Linkage System},
year = {2017},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/data61/clkhash}},
}