sscu-budapest utilities for scientific data engineering


Keywords
research-software
License
MIT
Install
pip install sscutils==0.3.3

Documentation

sscutils


Some utility functions to help with:

  • setting up data subsets with invoke
  • simplified dvc pipeline registry

These are used in dataset-template and research-project-template.

Make sure that the python command points to Python 3.8 or later.
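
One way to confirm the interpreter requirement is a short check run with the same python command used for installation. This is a minimal sketch, not part of sscutils; the helper name python_is_supported is hypothetical.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version meets the minimum.

    Hypothetical helper for illustration; sscutils does not ship it.
    """
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        # Exit with a message so the problem is visible before installing.
        raise SystemExit(
            f"Python >= 3.8 required, found {sys.version.split()[0]}"
        )
```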

Lookahead

  • overlapping names convention
  • resolve naming confusion with colassigner, colaccessor and table feature / composite type / index base classes
  • abstract composite type + subclass of entity class
    • import ACT, inherit from it and specify
    • importing composite type is impossible now if it contains foreign key :(
  • automatic filter for env creation based on foreign key metadata
  • add option to infer data type of assigned feature
    • can be problematic because of the pandas int/float/NaN issue
  • metadata created dry, dynamically, but imported static, wet
  • sharing functions among projects
    • functions specific to processing certain composite / named types
    • e.g. function dealing with fitting into a limit in dogshow project 1
  • detecting reliance of composite type given by assigner
    • can wait, as initial import is just the assigner transformed to accessor
  • overlapping in entities
    • detect / signal the same type of entity
  • properly assert importing
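
The pandas int/float/NaN issue mentioned under data type inference can be seen in a minimal sketch. It uses only standard pandas calls and is not taken from sscutils; the variable names are illustrative. The point is that a feature which is conceptually an integer is inferred as float as soon as one value is missing, so naive inference from assigned data may record the wrong type.

```python
import pandas as pd

# A complete integer column is inferred as an integer dtype.
ints = pd.Series([1, 2, 3])

# The same kind of column with one missing value is silently
# upcast to float, because NaN is a floating point value.
with_missing = pd.Series([1, 2, None])

# The nullable extension dtype keeps integers and marks the gap as <NA>.
# It has to be requested explicitly; it is not inferred by default.
nullable = pd.Series([1, 2, None], dtype="Int64")

print(ints.dtype, with_missing.dtype, nullable.dtype)
```

A type inference option would probably need either an explicit override per feature or a rule for choosing nullable dtypes; which one fits sscutils is left open here.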