Securetypes' securedict is an implementation of dict that protects against
algorithmic complexity attacks. With a normal dict, if an adversary can
control the keys inserted into the dict, he can slow the program to a halt
by picking keys with colliding hash()es. Here are some sample colliding
hash()es:
>>> hash(2**64), hash(2*2**64-1), hash(3*2**64-2), hash(4*2**64-3)
(1, 1, 1, 1)
>>> hash("\x00"), hash("\x00\x03"), hash("\x00\x00\x02"), hash("\x00\x00\x00\x05")
(1, 1, 1, 1)
Better string-hash collisions can be found by analyzing Python's hash function.
To protect against algorithmic complexity attacks, internally securedict
stores a key wrapper for each key:
("_securedictmarker", sha1(key + secret), key)sha1 protects against collisions, and secret prevents adversaries from
controlling the hash() of the key wrapper. (Without secret, you could just
pre-compute hash(sha1(key)) for 2**32 keys.)
securedict implements most of the dict API; you can use it much like a dict:
from securetypes import securedict
d = securedict(x=3)
d['y'] = 4
print d # prints securedict({'y': 4, 'x': 3})
# Special features:
print d.repr_like_dict() # prints {'y': 4, 'x': 3}For more information about algorithmic complexity attacks, see:
-
http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/
-
http://mail.python.org/pipermail/python-dev/2003-May/035874.html
-
http://mail.python.org/pipermail/python-dev/2011-December/115116.html
-
A
securedictsupports only these types for keys:str,unicode,int,long,float,bool, andNoneType. Future versions will supporttupleand any object with a__securehash__. -
A
securedictis even less thread-safe than adict. Don't use the samesecuredictobject in multiple threads. Doing this may result in strange exceptions. -
A
securedictis==to a normaldict(if the contents are the same). -
securedictis a subclass ofdict. -
In CPython,
dict()ing asecuredictgives you garbage (a dictionary containing key wrappers instead of the keys). In pypy it works, though you still should neverdict()asecuredictbecause it defeats the purpose ofsecuredict. -
.copy()returns asecuredict. -
.popitem()may pop a different item than an equal dict would; see the unit tests. -
A
securedict's<and>compares thesecuredict's id instead of using Python's complicated algorithm. This may change in the future to work like Python's algorithm (see CPythondictobject.c:dict_compare). Don't rely on the current "compares id" behavior. -
In Python 2.7+, calling
.viewitems()or.viewkeys()raisesNotImplementedError, while.viewvalues()works as usual. -
sys.setdefaultencodingmay affect asecuredictdifferently than it affectsdict. (No one should ever usesetdefaultencoding, but pygtk does.)
Again: never dict() a securedict.
Don't use nans as dictionary keys. securedict can't help you here.
All nans have the same hash() and are not equal to any object.
CPython 2.4+ or pypy (tested 1.4 and 1.5)
python setup.py install
This installs the modules securetypes and test_securetypes.
Securetypes is also available on PyPI http://pypi.python.org/pypi/Securetypes and you can install it with pip:
pip install Securetypes
Install Twisted, then run trial test_securetypes
securedict is very useful when decoding JSON objects, because the objects
could contain many hash()-colliding keys. You can tell json/simplejson
to create securedicts instead of dicts:
try:
import simplejson as json
except ImportError:
import json
from securetypes import securedict
dec = json.decoder.JSONDecoder(object_pairs_hook=securedict)
print dec.decode('{"b": ["bee", {}]}')
# prints securedict({'b': ['bee', securedict({})]})