hyperloglogplus

HyperLogLog++ with MinHash for efficient cardinality and intersection estimation using constant space. See original AdRoll paper for details: http://tech.adroll.com/media/hllminhash.pdf


Keywords
bsd3, library, test, Data.HyperLogLogPlus, Data.HyperLogLogPlus.Config, Data.HyperLogLogPlus.Type, HyperLogLog and MinHash
License
BSD-3-Clause
Install
cabal install hyperloglogplus-0.1.0.0

Documentation

HyperLogLogPlus


A Haskell implementation of HyperLogLog++ with MinHash for efficient cardinality and intersection estimation in constant space.

See the original AdRoll paper for details: HyperLogLog and MinHash (http://tech.adroll.com/media/hllminhash.pdf)

See also the AdRoll blog post.

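-- Example (GHCi session):
:set -XDataKinds
:load Data.HyperLogLogPlus

-- A counter type with HyperLogLog precision 12 and MinHash size 8192
type HLL = HyperLogLogPlus 12 8192

-- An empty counter
mempty :: HLL

-- Estimated number of distinct values after inserting 1 .. 75000
size (foldr insert mempty [1 .. 75000] :: HLL)

-- Estimated cardinality of the union of two counters, merged with (<>)
size $ (foldr insert mempty [1 .. 5000] :: HLL) <> (foldr insert mempty [3000 .. 10000] :: HLL)

-- Estimated cardinality of the intersection of two counters
-- (the :{ ... :} wrapper lets GHCi accept the multi-line expression)
:{
intersection $ [ (foldr insert mempty [1 .. 15000] :: HLL)
               , (foldr insert mempty [12000 .. 20000] :: HLL) ]
:}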
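
To illustrate the idea behind the intersection estimate, here is a minimal standalone sketch of the approach described in the AdRoll paper: estimate the Jaccard index of two sets from their MinHash sketches, then multiply it by the cardinality of the union. It is not this library's implementation. The names `mix64`, `sketch`, `jaccard` and `intersectionEstimate` are hypothetical, the hash is a SplitMix64-style mixer chosen only for illustration, and the union cardinality is passed in directly, where the library would presumably obtain it from the HyperLogLog++ part of the counter.

```haskell
import Data.Bits (shiftR, xor)
import Data.List (group, sort)
import Data.Word (Word64)

-- Hypothetical hash: a SplitMix64-style mixer (an assumption made for this
-- sketch, not the hash function used by the library).
mix64 :: Word64 -> Word64
mix64 x =
  let z0 = x + 0x9e3779b97f4a7c15
      z1 = (z0 `xor` (z0 `shiftR` 30)) * 0xbf58476d1ce4e5b9
      z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb
  in z2 `xor` (z2 `shiftR` 31)

-- A MinHash sketch: the k smallest distinct hash values, in ascending order.
type Sketch = [Word64]

sketch :: Int -> [Word64] -> Sketch
sketch k xs = take k [h | (h : _) <- group (sort (map mix64 xs))]

-- Jaccard estimate: among the k smallest hashes of the combined sketches,
-- the fraction that occurs in both sketches.
jaccard :: Int -> Sketch -> Sketch -> Double
jaccard k a b =
  let smallest = take k (group (sort (a ++ b)))
      -- hashes are distinct within one sketch, so a group of two
      -- means the hash is present in both
      shared = length (filter ((> 1) . length) smallest)
  in if null smallest
       then 0
       else fromIntegral shared / fromIntegral (length smallest)

-- Intersection estimate: Jaccard index times the cardinality of the union.
intersectionEstimate :: Double -> Double -> Double
intersectionEstimate j unionCardinality = j * unionCardinality
```

For the two ranges used in the example above, `intersectionEstimate (jaccard 1024 (sketch 1024 [1 .. 15000]) (sketch 1024 [12000 .. 20000])) 20000` approximates the true intersection of 3001 values. A larger `k` reduces the error of the Jaccard estimate at the cost of more memory per sketch.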

Testing

stack test