github.com/metasv/bloom

Package bloom provides data structures and methods for creating Bloom filters. A Bloom filter is a representation of a set of _n_ items, where the main requirement is to make membership queries; _i.e._, whether an item is a member of a set. A Bloom filter has two parameters: _m_, a maximum size (typically a reasonably large multiple of the cardinality of the set to represent) and _k_, the number of hashing functions on elements of the set. (The actual hashing functions are important, too, but this is not a parameter for this implementation). A Bloom filter is backed by a BitSet; a key is represented in the filter by setting the bits at each value of the hashing functions (modulo _m_). Set membership is done by _testing_ whether the bits at each value of the hashing functions (again, modulo _m_) are set. If so, the item is in the set. If the item is actually in the set, a Bloom filter will never fail (the true positive rate is 1.0); but it is susceptible to false positives. The art is to choose _k_ and _m_ correctly. In this implementation, the hashing functions used is murmurhash, a non-cryptographic hashing function. This implementation accepts keys for setting as testing as []byte. Thus, to add a string item, "Love": Similarly, to test if "Love" is in bloom: For numeric data, I recommend that you look into the binary/encoding library. But, for example, to add a uint32 to the filter: Finally, there is a method to estimate the false positive rate of a particular Bloom filter for a set of size _n_: Given the particular hashing scheme, it's best to be empirical about this. Note that estimating the FP rate will clear the Bloom filter.


License
BSD-2-Clause
Install
go get github.com/metasv/bloom