pyshmht

Provides a shared-memory-based hash table for Python


Keywords: python, extension, sharing, memory, based, hash, table
License: BSD-3-Clause
Install: pip install pyshmht==0.0.2

Documentation

Shared-memory-based hash table extension for Python

BE CAREFUL: this package is not for general-purpose use. It only accepts keys shorter than max_key_size and values shorter than max_value_size, and although it is backed by shared memory, it DOES NOT use locks to guard against concurrent access. It was designed for a former project that had a single writer process and many reader processes that started only after the writer had finished.

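A minimal sketch of the write-once, read-many pattern described above. The import path, constructor arguments (name, capacity, force_init) and method names (set, get, close) used here are assumptions based on the wrapper in pyshmht/HashTable.py; check that file for the exact API.

    # Sketch only: the import path, constructor arguments and method names
    # below are assumptions; verify them against pyshmht/HashTable.py.
    from pyshmht import HashTable

    # Writer process: create the shared-memory table and fill it once.
    ht = HashTable('demo_table', capacity=1024, force_init=True)
    ht.set('hello', 'world')   # keys/values must stay below max_key_size / max_value_size
    ht.close()

    # Reader processes (started only after the writer is done): attach by name and read.
    ht = HashTable('demo_table', capacity=1024)
    print(ht.get('hello'))     # -> 'world'
    ht.close()
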
For more examples, see the test cases in the Python files (pyshmht/Cacher.py, pyshmht/HashTable.py), where you can also find performance tests.

Performance

capacity = 200M, 64-byte keys/values, tested on a Xeon E5-2670 0 @ 2.60GHz with 128GB RAM; a rough timing sketch for reproducing such figures follows the list.

  • hashtable.c (raw hash table in C, tested on malloc'ed memory)

set: 0.93M iops;
get: 2.35M iops;

  • performance_test.py (raw Python binding)

set: 451k iops;
get: 272k iops;

  • HashTable.py (simple wrapper, no serialization)

set: 354k iops;
get: 202k iops;

  • Cacher.py (cached wrapper, with serialization)

set: 501k iops (cached), 228k iops (after write_back);
get: 560k iops (cached), 238k iops (no cache);

  • Python's native dict

set: 741k iops;
get: 390k iops;

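As a rough illustration of how such iops figures can be measured, here is a timing sketch. It is not the project's performance_test.py, and it reuses the assumed HashTable API from the sketch above.

    # Rough timing sketch; the HashTable arguments and methods are assumptions.
    import time
    from pyshmht import HashTable

    N = 100_000
    value = 'x' * 60                      # roughly 64-byte payloads, as in the tests above
    ht = HashTable('bench_table', capacity=N, force_init=True)

    start = time.time()
    for i in range(N):
        ht.set(str(i), value)
    print('set: %.0fk iops' % (N / (time.time() - start) / 1000))

    start = time.time()
    for i in range(N):
        ht.get(str(i))
    print('get: %.0fk iops' % (N / (time.time() - start) / 1000))
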
Notice

In hashtable.c, the default max key length is 256 - 4 bytes and the default max value length is 1024 - 4 bytes; you can change bucket_size and max_key_size manually, but bear in mind that increasing these two values results in larger memory consumption.

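As a back-of-the-envelope illustration of why larger buckets cost memory, here is an assumed layout of one fixed-size bucket per slot; the real layout and overhead in hashtable.c may differ.

    # Rough estimate only; the real per-bucket layout in hashtable.c may differ.
    def estimated_memory_bytes(capacity, bucket_size):
        """Approximate footprint assuming one fixed-size bucket per slot."""
        return capacity * bucket_size

    # e.g. 1M slots with 1 KB buckets -> roughly 0.95 GiB of shared memory
    print(estimated_memory_bytes(1_000_000, 1024) / 2**30, 'GiB')
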
If you find any bugs, please submit an issue or send me a pull request; I'll see to it ASAP :)

p.s. hashtable.c is independent (i.e. it has nothing to do with Python), so you can use it in other projects if needed. :P