data-records

Immutable Data Records with Type Coercion


License

MIT

Install

pip install data-records==0.4.0

Data Records

Some functional languages have the concept of a record: a product data type that holds immutable data in typed attributes.

Goals

The following goals are the "north star" for the design of this project:

  • Ease Of Use
    • Simple Interface
    • Does the obvious thing in most cases
  • Immutability
    • Follow Immutability Patterns such as replace and pattern matching
  • Safety
    • Include Type Coercion Where Possible
    • Guarantee that a record's attributes have the declared types
    • Throw Warning when it is implemented Incorrectly

Motivation

Enforced Typing

I love @dataclass, and was ecstatic when it was added to Python. However, behavior like the following:

>>> from dataclasses import dataclass

>>> @dataclass
... class Foo:
...     bar: str
...     baz: int

>>> Foo(1, 2)
Foo(bar=1, baz=2)

is not what I would expect when coming from other typed languages. In a statically typed language, this would be an error because bar should be a string. In a language with type coercion, I would expect bar to become "1". By default, dataclasses do neither: if I passed this instance to code that expected bar to be a string, it would fail with a runtime exception, which is exactly what the types were supposed to help prevent. With @datarecord, the same arguments are coerced to the annotated types, and a value that cannot be coerced raises an error:

>>> from data_records import datarecord

>>> @datarecord
... class Foo:
...     bar: str
...     baz: int

>>> Foo(1, 2)
Foo(bar='1', baz=2)

>>> Foo("a", "b")
Traceback (most recent call last):
 ...
ValueError: invalid literal for int() with base 10: 'b'
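
For comparison, the coercion shown above can be approximated with the standard library alone. The following is only a minimal sketch, assuming simple concrete annotations such as str and int; the class name CoercedFoo is hypothetical, and this is not a description of how data-records implements coercion internally.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass(frozen=True)
class CoercedFoo:  # hypothetical name, used only for this sketch
    bar: str
    baz: int

    def __post_init__(self):
        # Resolve annotations to real types (also works if they are strings).
        for name, expected in get_type_hints(type(self)).items():
            value = getattr(self, name)
            if not isinstance(value, expected):
                # Frozen dataclasses block normal assignment, so bypass it
                # once, during construction only. Calling the type raises
                # ValueError or TypeError if the value cannot be converted.
                object.__setattr__(self, name, expected(value))


print(CoercedFoo(1, 2))  # CoercedFoo(bar='1', baz=2)
```

This sketch only handles annotations that are callable types. Generic annotations such as list[int] or Optional[str] would need extra handling, which is part of the boilerplate that a dedicated decorator can hide.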

Extraneous Field Handling

Another problem with dataclasses occurs when passing in a dictionary that has more keys than the dataclass has fields:

>>> from dataclasses import dataclass

>>> @dataclass
... class Foo:
...     bar: str
...     baz: int

>>> Foo(**{'bar': 'test', 'baz': 1, 'other': 'nothing'})
Traceback (most recent call last):
 ...
TypeError: __init__() got an unexpected keyword argument 'other'

This makes it hard to build records from larger database query results or other received data. Data records ignore the extra keys instead:

>>> from data_records import datarecord

>>> @datarecord
... class Foo:
...     bar: str
...     baz: int

>>> Foo(**{'bar': 'test', 'baz': 1, 'other': 'test'})
Foo(bar='test', baz=1)

>>> Foo.from_dict({'bar': 'test', 'baz': 1, 'other': 'test'})
Foo(bar='test', baz=1)
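
With a plain dataclass, the usual workaround is to filter the dictionary by hand before constructing the instance. A minimal sketch of that workaround follows; the helper name from_dict_ignoring_extras is hypothetical and is not part of data-records or the standard library.

```python
from dataclasses import dataclass, fields


@dataclass
class Foo:
    bar: str
    baz: int


def from_dict_ignoring_extras(cls, data):
    """Build a dataclass instance from data, dropping unknown keys."""
    known = {f.name for f in fields(cls)}
    return cls(**{key: value for key, value in data.items() if key in known})


row = {'bar': 'test', 'baz': 1, 'other': 'nothing'}
print(from_dict_ignoring_extras(Foo, row))  # Foo(bar='test', baz=1)
```

A helper like this has to be repeated or imported wherever records are built from wider data, which is the friction the from_dict example above avoids.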

Immutable Handling

Data records are immutable (much like frozen dataclasses), and helpers for working with immutable values, such as replace and extract, are built in:

>>> from data_records import datarecord

>>> @datarecord
... class Foo:
...     bar: str
...     baz: int
...     lat: float
...     long: float

>>> example = Foo('test', 2, 65.1, -127.5)
>>> example2 = example.replace(bar='testing')

>>> example
Foo(bar='test', baz=2, lat=65.1, long=-127.5)

>>> example2
Foo(bar='testing', baz=2, lat=65.1, long=-127.5)

>>> latitude, longitude = example.extract('lat', 'long')
>>> latitude
65.1
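
For readers who know frozen dataclasses, the closest standard-library counterparts are dataclasses.replace and operator.attrgetter. The sketch below is a rough comparison under that assumption, not a statement about how replace and extract are implemented in data-records; the class name FrozenFoo is hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from operator import attrgetter


@dataclass(frozen=True)
class FrozenFoo:  # hypothetical name, used only for this sketch
    bar: str
    baz: int
    lat: float
    long: float


example = FrozenFoo('test', 2, 65.1, -127.5)

# replace() returns a new instance and leaves the original untouched.
example2 = replace(example, bar='testing')

# attrgetter with several names returns a tuple that can be unpacked.
latitude, longitude = attrgetter('lat', 'long')(example)

# Direct assignment on a frozen instance is rejected.
try:
    example.bar = 'changed'
except FrozenInstanceError:
    mutated = False
else:
    mutated = True
```

Unlike the data record above, a frozen dataclass performs no coercion, so FrozenFoo(1, 2, 65.1, -127.5) would still store bar as an int.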