Rooster

A simple, fast Hash-key data storage system for Python applications


Keywords
hashmap, data, storing, database
License
Other
Install
pip install Rooster==1.0.0

Documentation

Rooster

Rooster is a Python module that lets you quickly and easily store data as key-value pairs in a hash-bucket storage unit. It is well suited to storing timestamps, usernames, passwords, data values, and more. Each key-value pair is stored in a separate file, so Rooster can carry much higher loads than SQLite or other single-file database systems.

Installation

To install Rooster, in your terminal type:

pip install Rooster


The Hashing Algorithm

Rooster uses a custom-made hashing algorithm that stores keys as a 16-character alphanumeric expression, as opposed to the 16-bit or 128-bit number that traditional hashing algorithms produce.

First, Rooster takes keys as strings containing only alphanumeric characters or spaces. Then it converts the key from string form to an integer by matching each character to a specific integer, in a non-typical fashion. Here is a sample flowchart:

"love" -> l -> 2 ->
          o -> 4  -> 419232
          v -> 19 ->
          e -> 23 ->

Once the integer is created, it is passed to a different function that maps it to multiple doublets or triplets of alphanumeric characters in a list. This list is special because its __getitem__ method is modified to accommodate integers arbitrarily higher than its maximum index. Any time an index into the infinite list is greater than the maximum index, the index is reduced to its remainder modulo the list length so that it fits into the list. Here is another flowchart:

infinitelist = ['a3', 'rt', 'zq', 'yg', '8i']

11 -> 11 % 5 = 1 -> infinitelist[1] = 'rt'

11 *= 8

88 -> 88 % 5 = 3 -> infinitelist[3] = 'yg'

hash_seq = 'rtyg'

The idea behind using doublets and triplets of alphanumeric characters is that it creates more spread between patterns. In addition, the hash function varies with each integer, which is intended to make a collision nearly impossible.
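The wrap-around indexing described above can be sketched as follows. This is a minimal illustration, not Rooster's actual source: the class name WrappingList is hypothetical, and the sample pairs and the multiply-by-8 step are taken from the flowchart.

```python
class WrappingList(list):
    """A list whose integer indexes wrap around instead of raising IndexError."""

    def __getitem__(self, index):
        # Reduce any integer index modulo the list length so it always fits.
        if isinstance(index, int):
            if not self:
                raise IndexError("cannot index an empty WrappingList")
            index %= len(self)
        return super().__getitem__(index)


# Sample pairs from the flowchart; the real table is not shown in this document.
pairs = WrappingList(['a3', 'rt', 'zq', 'yg', '8i'])

n = 11
hash_seq = pairs[n]   # 11 % 5 == 1 -> 'rt'
n *= 8
hash_seq += pairs[n]  # 88 % 5 == 3 -> 'yg'
print(hash_seq)       # rtyg
```

Subclassing list keeps slicing and iteration unchanged; only integer lookups are wrapped.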

do_hash(string):

Hashes a string of alphanumeric characters or spaces into a 16-character hash key. Raises an error if the string contains invalid characters.

Example:

>>> do_hash('66')
'b5hl3l4kbkfkvkmk'

Data Storage

Rooster stores a key-value pair by hashing the key and creating a file named after that hash value, with the value linked to the key stored inside the file. It does so in a directory that must be a subdirectory of the script's current working directory. You can create it yourself or use this command:

import os

def createrooster(directory):
    # Create the storage directory, refusing to reuse an existing one.
    if not os.path.exists(directory):
        os.makedirs(directory)
    else:
        raise FileExistsError("directory name already exists")

Rooster can store data in multiple directories; you just specify which directory you want to access.

Setting Data

To set a key-value pair with Rooster, or to save data in general, you have quite a few options. The first, and most basic:

set_rooster(key, value, directory):

set_rooster takes a key in string format, a value in any format, and the name of a subdirectory. It saves the value in a file named after the hash value of the key, with a .rooster extension. This function overwrites any existing file that has the same key. If the directory does not exist, it returns a FileNotFoundError, and it returns a ValueError if the key contains unsupported characters.
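A rough sketch of the one-file-per-key idea behind set_rooster is shown here. It is not Rooster's implementation: the names key_path and sketch_set are hypothetical, a truncated SHA-256 digest stands in for do_hash, and the sketch assumes the two error cases are raised as exceptions.

```python
import hashlib
import os
import tempfile


def key_path(key, directory):
    # Stand-in for Rooster's do_hash: first 16 hex digits of SHA-256 (an assumption).
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return os.path.join(directory, digest + ".rooster")


def sketch_set(key, value, directory):
    # Mirror the documented checks: valid key characters, existing directory.
    if not all(ch.isalnum() or ch == " " for ch in key):
        raise ValueError("key contains unsupported characters")
    if not os.path.isdir(directory):
        raise FileNotFoundError("directory does not exist")
    # Mode "w" overwrites any existing file for the same key.
    with open(key_path(key, directory), "w", encoding="utf-8") as handle:
        handle.write(str(value))


store = tempfile.mkdtemp()
sketch_set("user name", [1, 2, 3], store)
sketch_set("user name", "alice", store)  # overwrites the earlier value
with open(key_path("user name", store), encoding="utf-8") as handle:
    stored = handle.read()
print(stored)  # alice
```

Because every key maps to its own file, writing one key never touches the data stored for another.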

safe_set(key, value, directory):

safe_set provides the same functionality as set_rooster, but it only sets the key-value pair if the key does not already exist. If this is used with a key that already exists in the directory, it returns "key already exists".
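One way to get this "write only if absent" behaviour is Python's exclusive-create file mode. The sketch below is an illustration under assumptions, not Rooster's code: sketch_safe_set is a hypothetical name, and a truncated SHA-256 digest stands in for do_hash.

```python
import hashlib
import os
import tempfile


def sketch_safe_set(key, value, directory):
    # Hypothetical file name: a SHA-256 prefix stands in for Rooster's do_hash.
    name = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16] + ".rooster"
    path = os.path.join(directory, name)
    try:
        # Mode "x" creates the file and fails if it already exists,
        # so the existence check and the write happen in one step.
        with open(path, "x", encoding="utf-8") as handle:
            handle.write(str(value))
    except FileExistsError:
        return "key already exists"
    return None


store = tempfile.mkdtemp()
first = sketch_safe_set("token", 42, store)   # file is created
second = sketch_safe_set("token", 99, store)  # file exists, value is kept
print(first, second)
```

Using mode "x" avoids a separate existence check, so a second writer cannot slip in between the check and the write.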

set_dict(dict, directory):

Takes a Python dictionary, and for every key-value pairing in the dictionary, calls set_rooster on them with the specified directory.

safeset_dict(dict, directory):

Takes a Python dictionary object, and for every key-value pairing in the dictionary, calls safe_set on them with the specified directory. If some of the keys in the dictionary are already hashed into the directory, it will not write over those keys but will write the others.

del_key(key, directory):

Deletes a .rooster key file in the specified directory.

Getting Data

In Rooster, you can retrieve data in two ways. You can either get only the string version of the value you saved in your directory, or have it parsed back into a Python object. Rooster supports string-to-object conversion for integers, lists, sets, tuples, and dictionaries, as well as other objects that have a __str__ method.

check_key(key, directory):

Checks if a key already exists in a specified directory. Returns True if it does, otherwise returns False.

get_rooster(key, directory):

Returns the value set for a hashed key in the specified directory. If the key or directory does not exist, it returns "key or directory does not exist". If the key contains invalid characters, it returns "key contains invalid characters, please use only alpha numeric or spaces". If the stored string matches a known pattern, it is parsed back into a Python data structure, such as a list; otherwise, the string is returned as is.

get_str_rooster(key, directory):

Provides the same functionality as get_rooster, but returns the value stored in the .rooster file in the specified directory as a string, without parsing it. Returns the same error messages as get_rooster.

Parsing Data

parse_data(string):

Takes a string and attempts to match it against various regular expressions to determine whether it represents a Python data structure. If one of them matches, it calls another function, to_obj, which uses the exec() and locals() builtins to convert it quickly back into object form.
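For comparison, the same string-to-object round trip can be sketched with the standard library's ast.literal_eval. This is a swapped-in technique, not what parse_data does: it replaces the regular expressions and exec() with a parser that accepts only Python literals. The name sketch_parse is hypothetical.

```python
import ast


def sketch_parse(text):
    # literal_eval only accepts Python literals, so arbitrary code is never run.
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        # Not a literal: hand back the original string unchanged.
        return text


print(sketch_parse("[1, 2, 3]"))    # a list
print(sketch_parse("{'a': 1}"))     # a dict
print(sketch_parse("(4, 5)"))       # a tuple
print(sketch_parse("hello world"))  # stays a string
```

As with get_rooster, anything that does not look like a supported data structure comes back as the original string.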