pywhip

Python package to validate data against whip specifications


Keywords
pywhip, whip, Darwin_Core_Archive, data, validation, data-validation, lifewatch, oscibio, python
License
MIT
Install
pip install pywhip==0.3.4

pywhip is a Python package to validate data against whip specifications, a human- and machine-readable syntax for expressing data specifications.

Check the documentation pages for more information.

Installation

To install pywhip, run this command in your terminal:

pip install pywhip

For more detailed installation instructions, see the documentation pages.

Test pywhip in a Jupyter notebook

Launch a Jupyter notebook on Binder to try out the pywhip package interactively.

Quickstart

To validate a CSV data file with the field headers country, eventDate and individualCount, write whip specifications according to the whip syntax:

specifications = """
    country:
        allowed: [BE, NL]
    eventDate:
        dateformat: '%Y-%m-%d'
        mindate: 2016-01-01
        maxdate: 2018-12-31
    individualCount:
        numberformat: x  # needs to be an integer value
        min: 1
        max: 100
    """

To whip your data set, e.g. my_data.csv, pass the data file and the specifications to whip_csv:

from pywhip import whip_csv

example = whip_csv("my_data.csv", specifications, delimiter=',')

Then write the output report to an HTML file:

with open("report_example.html", "w") as index_page:
    index_page.write(example.get_report('html'))

This results in a report like this. For a more detailed introduction, see the documentation tutorial.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Data rows are validated using the Cerberus package.