atomic-store

A robust, atomic single-file value store


Keywords
atomic, store
License
MIT
Install
pip install atomic-store==0.0.1

Documentation

atomic_store

Easier than a DBMS, but more fault-resistant than just a file.

Sometimes you need to manage a bit of state across executions. Sometimes, a full-blown database is just too much.

This library makes it easy to keep a store of stuff in a JSON file, in an atomic and fault-resistant manner.

Other formats (like pickle and bson) are also supported, and arbitrary formats are possible.

Install

Just pip install atomic_store. Or, if you must, pip install -r requirements.txt

Note that the only dependency is atomicwrites, which itself has no dependencies.

Usage

By default, the store is encoded as JSON and written to a temporary file, which then atomically replaces the old file. When reading, if the file does not exist, a default value is used. Unless you specify one, that default value is None.
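
The write-then-replace scheme described above can be sketched with the standard library alone. This is only an illustration of the general technique, not atomic_store's actual implementation (which relies on atomicwrites); the helper names `load_value` and `save_value` are hypothetical.

```python
import json
import os
import tempfile


def load_value(path, default=None):
    """Return the decoded JSON value, or `default` if the file is missing."""
    try:
        with open(path, 'r', encoding='utf-8') as fp:
            return json.load(fp)
    except FileNotFoundError:
        return default


def save_value(path, value):
    """Write `value` to a temporary file, then atomically replace `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target, so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as fp:
            json.dump(value, fp)
            fp.flush()
            os.fsync(fp.fileno())
        # Readers see either the old file or the new one, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file and leave the old file untouched.
        os.unlink(tmp_path)
        raise
```

Calling `load_value('runs.json', default=[])` on a missing file returns the default, and a crash in the middle of `save_value` leaves the previous file in place.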

Context Manager

This program remembers all start times:

import atomic_store
import time

with atomic_store.open('runs.json', default=[]) as store:
    print('Previous executions:')
    print(store.value)
    new_entry = time.strftime('%Y-%m-%d %H:%M:%S%z')
    store.value.append(new_entry)

Leaving the context manager takes care of all writes. No intermediate values get written to disk.

This is ideal if the task is short-lived and, in case of any error, you want to keep the old state anyway.

For advanced uses, also see the subsection on reentrancy.
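
The semantics described in this subsection (a single write on leaving the block, and the old state kept on error) can be approximated as follows. `open_store` is a hypothetical stand-in written for this illustration, not the library's code, and the rule that nothing is written on error is inferred from the text above.

```python
import contextlib
import json
import os
import types


@contextlib.contextmanager
def open_store(path, default=None):
    """Hypothetical stand-in: load on entry, write once on a clean exit."""
    try:
        with open(path, 'r', encoding='utf-8') as fp:
            value = json.load(fp)
    except FileNotFoundError:
        value = default
    store = types.SimpleNamespace(value=value)
    # An exception raised in the `with` body propagates from this `yield`,
    # so the lines below it are skipped and the old file stays as it was.
    yield store
    tmp_path = path + '.tmp'
    with open(tmp_path, 'w', encoding='utf-8') as fp:
        json.dump(store.value, fp)
    os.replace(tmp_path, path)
```

If the body of `with open_store(...)` raises, the file on disk still holds the value from the last successful run.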

Manual control

This program records its progress after each step:

import atomic_store

my_store = atomic_store.open('gathered.json', default=dict())

my_store.value['state'] = 'running'
my_store.value['thought'] = 'I would not eat green eggs and ham.'
my_store.commit()
# ... some calculations ...
my_store.value['state'] = 'done'
my_store.value['thought'] = 'I do so like Green eggs and ham!'
my_store.commit()

Only calls to commit() cause writes to the disk. Again, no intermediate values get written to disk.

This is ideal if you have a long-running job with clear steps, and each step's output is valuable.

Note that commit() is also available in the context manager.

Format tweaks

If you're using the json backend, and want to keep the JSON file as small as possible, you can call open with dump_kwargs=dict(separators=(',', ':')). The keyword load_kwargs also exists.
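
To see what the compact separators change, here is the same value encoded with the standard json module directly. The assumption is that dump_kwargs is passed through to the JSON encoder, as the paragraph above suggests.

```python
import json

value = {'state': 'done', 'runs': [1, 2, 3]}

# Default separators put a space after each comma and colon.
default_text = json.dumps(value)
# Compact separators drop that whitespace.
compact_text = json.dumps(value, separators=(',', ':'))

print(default_text)  # {"state": "done", "runs": [1, 2, 3]}
print(compact_text)  # {"state":"done","runs":[1,2,3]}
```

Both strings decode to the same value; only the whitespace differs.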

Non-JSON formats

You can use arbitrary other formats, using the format keyword:

atomic_store.open('runs.json', default=[], format=MY_FORMAT)

Supported values are None (for JSON), 'json', 'pickle', 'bson' (requires bson to be installed), and also any module or object providing dump/load or dumps/loads. By default, atomic_store assumes you operate on binary files, except when JSON is involved. To override this, you can set is_binary. Note that this means you can use the modules json, pickle, and bson as they are.

For convenience, you can also subclass the abstract classes atomic_store.AbstractFormatFile or atomic_store.AbstractFormatBstr.

In all cases, load_kwargs and dump_kwargs are still supported.
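
As an illustration of "any object providing dumps/loads", here is a hypothetical format object that stores zlib-compressed JSON. The class name `CompressedJson` is made up for this sketch, and it assumes that a binary format's dumps returns bytes and that extra keyword arguments are simply forwarded; check the library's source before relying on either.

```python
import json
import zlib


class CompressedJson:
    """Hypothetical format object: zlib-compressed JSON via dumps/loads."""

    @staticmethod
    def dumps(value, **kwargs):
        # Encode to JSON text first, then compress the UTF-8 bytes.
        return zlib.compress(json.dumps(value, **kwargs).encode('utf-8'))

    @staticmethod
    def loads(data, **kwargs):
        # Reverse the two steps: decompress, then decode the JSON text.
        return json.loads(zlib.decompress(data).decode('utf-8'), **kwargs)


blob = CompressedJson.dumps({'runs': ['2024-01-01']})
restored = CompressedJson.loads(blob)

# Assumed usage (not executed here), following the `format` keyword above:
# atomic_store.open('runs.json.z', default=[], format=CompressedJson)
```

Because the object produces bytes, it fits the default assumption of a binary file described above.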

Reentrancy

If the same atomic_store is used as a context manager more than once, the default behavior is to write the file every time a with is exited, not only the outermost one:

# Assume `mystate.json` contains only `"before"`.
mngr = atomic_store.open('mystate.json', default=[])
with mngr as store:
    store.value = 'outer'
    # File contains `"before"`: We haven't exited any context manager yet.
    with mngr as store:
        store.value = 'inner'
        # File contains `"before"`: We haven't exited any context manager yet.
    # File now contains `"inner"`, because the inner `with`-statement wrote it.
    # See below if you consider this undesired behavior.
# File now contains `"inner"`, because the outer `with`-statement wrote it again.

If you consider this behavior undesirable, you can either use multiple context managers (by calling atomic_store.open multiple times), or pass the keyword ignore_inner_exits=True, like this:

# Assume `mystate.json` contains only `"before"`.
mngr = atomic_store.open('mystate.json', default=[], ignore_inner_exits=True)
with mngr as store:
    store.value = 'outer'
    # File contains `"before"`: We haven't exited any context manager yet.
    with mngr as store:
        store.value = 'inner'
        # File contains `"before"`: We haven't exited any context manager yet.
    # File *still* contains `"before"`, as the manager detected that it is still active.
# File now contains `"inner"` (the last value assigned), because the outer `with`-statement wrote it.
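
The difference between the two modes, as the comments in the examples above describe it, comes down to a nesting counter. The sketch below is hypothetical and only counts how often a write would happen; `CountingManager` and `count_writes` are illustrative names, not part of atomic_store.

```python
class CountingManager:
    """Hypothetical re-entrant manager that counts how often it would write."""

    def __init__(self, ignore_inner_exits=False):
        self.ignore_inner_exits = ignore_inner_exits
        self.depth = 0   # number of `with` blocks currently active
        self.writes = 0  # stand-in for "the file was written"

    def __enter__(self):
        self.depth += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.depth -= 1
        is_outermost = self.depth == 0
        # Skip the write on errors. Otherwise write on every exit, or only
        # on the outermost exit when inner exits are ignored.
        if exc_type is None and (is_outermost or not self.ignore_inner_exits):
            self.writes += 1
        return False  # never swallow exceptions


def count_writes(ignore_inner_exits):
    mngr = CountingManager(ignore_inner_exits=ignore_inner_exits)
    with mngr:
        with mngr:
            pass
    return mngr.writes


print(count_writes(False))  # 2: the inner and the outer exit each write
print(count_writes(True))   # 1: only the outermost exit writes
```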

Atomic is not magic

This library is not magical.

If two threads (or two processes, or whatever) open a store, modify something, and then write concurrently, one of the results may be lost. However, the writes are guaranteed to be atomic, so the data is merely lost, but not corrupted.
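
The lost update can be reproduced deterministically without threads. The sketch below uses plain standard-library helpers with hypothetical names in place of atomic_store and lets two "workers" read the same state before either one writes.

```python
import json
import os
import tempfile


def read_list(path):
    with open(path, 'r', encoding='utf-8') as fp:
        return json.load(fp)


def write_list_atomically(path, value):
    # Write to a temporary file, then swap it into place in one step.
    tmp_path = path + '.tmp'
    with open(tmp_path, 'w', encoding='utf-8') as fp:
        json.dump(value, fp)
    os.replace(tmp_path, path)


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, 'runs.json')
    write_list_atomically(path, [])

    # Two workers read the same starting state ...
    seen_by_a = read_list(path)
    seen_by_b = read_list(path)

    # ... then each adds its own entry and writes the whole value back.
    write_list_atomically(path, seen_by_a + ['from A'])
    write_list_atomically(path, seen_by_b + ['from B'])

    final = read_list(path)

print(final)  # ['from B']: still valid JSON, but the entry from A is gone
```

Each individual write is atomic, yet the second one overwrites the first because it started from stale data.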

TODOs

  • Figure out how to make bson optional
  • Publish on PyPI

NOTDOs

Here are some things this project will not support:

  • Any DB backend.
  • Any multi-file backend.
  • More advanced semantics than just commit. This includes rollback: it is not obvious which behavior is desired when the file does not exist (re-use the default value? What if it was modified, as happens with lists and dicts?), or with stacked context managers (roll back to the file's state, or to the beginning of the with?).

Contribute

Feel free to dive in! Open an issue or submit PRs.