pybpmn-parser
Starter code for using the hdBPMN dataset for diagram recognition research.
The dump_coco.py script can be used to convert the images and BPMN XMLs into a COCO dataset. COCO is a common format used in computer vision research to annotate the objects and keypoints in images.
python scripts/dump_coco.py path/to/hdBPMN path/to/target/coco/directory/hdbpmn
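As a rough illustration of what a COCO-style annotation file contains, the sketch below counts annotations per category. The top-level `images`, `annotations`, and `categories` keys follow the general COCO convention; the helper name `count_annotations_per_category`, the sample category names, and all sample values are made up for illustration. The names of the files actually written by dump_coco.py are not assumed here.

```python
from collections import Counter


def count_annotations_per_category(coco: dict) -> dict:
    """Return {category name: number of annotations} for a COCO-style dict."""
    # Map numeric category ids to their human-readable names.
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    return dict(counts)


# Tiny hand-written sample (hypothetical values, not taken from hdBPMN).
sample = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "task"}, {"id": 2, "name": "event"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 50]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [200, 20, 100, 50]},
        {"id": 3, "image_id": 1, "category_id": 2, "bbox": [150, 200, 30, 30]},
    ],
}

category_counts = count_annotations_per_category(sample)
print(category_counts)
```

To inspect a real annotation file, load it with `json.load` and pass the resulting dictionary to the helper.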
Moreover, the demo.ipynb Jupyter notebook can be used to visualize (1) the extracted bounding boxes, keypoints, and relations, and (2) the annotated BPMN diagram overlaid on the hand-drawn image. Note that the latter requires the bpmn-to-image tool, which in turn requires a Node.js installation.
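For readers who want to inspect boxes and keypoints outside the notebook, here is a minimal sketch of how COCO-style values are commonly decoded. It assumes the standard COCO conventions (`bbox` as `[x, y, width, height]` and `keypoints` as a flat list of `x, y, visibility` triplets). The helper names and sample values are hypothetical, and the notebook itself may handle this differently.

```python
def bbox_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def split_keypoints(flat):
    """Group a flat COCO keypoint list into (x, y, visibility) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoint list length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Hypothetical sample values, not taken from hdBPMN.
corners = bbox_to_corners([10, 20, 100, 50])
points = split_keypoints([15, 45, 2, 105, 45, 2])
print(corners)
print(points)
```

Corner coordinates are what most drawing APIs expect, which is why the conversion is usually the first step before plotting.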
Installation
pip install pybpmn-parser
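One possible way to confirm that the installation succeeded is to query the installed distribution's metadata from Python. This is only a sketch: `installed_version` is a hypothetical helper, and it relies solely on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("pybpmn-parser"))
```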
Development
In order to set up the necessary environment:
- create an environment pybpmn-parser with the help of conda:
  conda env create -f environment.yml
- activate the new environment with:
  conda activate pybpmn-parser
NOTE: The conda environment will have pybpmn-parser installed in editable mode. Some changes, e.g. in setup.cfg, might require you to run `pip install -e .` again.
Optional and needed only once after git clone:
- install the JupyterLab kernel:
  python -m ipykernel install --user --name "${CONDA_DEFAULT_ENV}" --display-name "$(python -V) (${CONDA_DEFAULT_ENV})"
- install several pre-commit git hooks with:
  pre-commit install # You might also want to run `pre-commit autoupdate`
  and check out the configuration under .pre-commit-config.yaml. The -n, --no-verify flag of git commit can be used to deactivate pre-commit hooks temporarily.
Project Organization
├── LICENSE.txt             <- License as chosen on the command-line.
├── README.md               <- The top-level README for developers.
├── data
│   ├── external            <- Data from third party sources.
│   ├── interim             <- Intermediate data that has been transformed.
│   ├── processed           <- The final, canonical data sets for modeling.
│   └── raw                 <- The original, immutable data dump.
├── docs                    <- Directory for Sphinx documentation in rst or md.
├── environment.yml         <- The conda environment file for reproducibility.
├── notebooks               <- Jupyter notebooks. Naming convention is a number (for
│                              ordering), the creator's initials and a description,
│                              e.g. `1.0-fw-initial-data-exploration`.
├── pyproject.toml          <- Build system configuration. Do not change!
├── scripts                 <- Analysis and production scripts which import the
│                              actual Python package, e.g. train_model.py.
├── setup.cfg               <- Declarative configuration of your project.
├── setup.py                <- Use `pip install -e .` to install for development or
│                              create a distribution with `tox -e build`.
├── src
│   └── pybpmn              <- Actual Python package where the main functionality goes.
├── tests                   <- Unit tests which can be run with `py.test`.
├── .coveragerc             <- Configuration for coverage reports of unit tests.
├── .isort.cfg              <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.
Note
This project has been set up using PyScaffold 4.0.1 and the dsproject extension 0.6.1.