This is an ever-growing package of core tools for use on client projects by Oreum Industries.
- Provides an essential workflow for data curation, EDA, basic ML using the core scientific Python stack incl. numpy, scipy, matplotlib, seaborn, pandas, scikit-learn, umap-learn
- Optionally provides an advanced Bayesian modelling workflow in R&D and Production using a leading probabilistic programming stack incl. pymc, pytensor, arviz (do pip install oreum_core[pymc])
- Optionally enables a generalist black-box ML workflow in R&D using a leading Gradient Boosted Trees stack incl. catboost, xgboost, optuna, shap (do pip install oreum_core[tree])
- Also includes several utilities for text cleaning, sql scripting, file handling
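The optional stacks listed above are only present if the matching extra was installed. As a rough sketch (not part of oreum_core itself), the following checks which extras look usable in the current environment. The helper name `available_extras` and the `EXTRAS` mapping are hypothetical, and the package lists are taken from this README rather than from pyproject.toml.

```python
from importlib.util import find_spec

# Assumption: extras and their packages as listed in this README,
# not read from pyproject.toml
EXTRAS = {
    "pymc": ["pymc", "pytensor", "arviz"],
    "tree": ["catboost", "xgboost", "optuna", "shap"],
}


def available_extras(extras=EXTRAS):
    """Return {extra_name: bool}; True means every listed package is importable."""
    return {
        name: all(find_spec(pkg) is not None for pkg in pkgs)
        for name, pkgs in extras.items()
    }


if __name__ == "__main__":
    for name, ok in available_extras().items():
        status = "available" if ok else "not installed"
        print(f"oreum_core[{name}]: {status}")
```

This only tests that the packages can be found on the import path; it does not verify versions.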
This package is:
- A work in progress (v0.y.z) and liable to breaking changes and inconvenience to the user
- Solely designed for ease of use and rapid development by employees of Oreum Industries, and selected clients with guidance
This package is not:
- Intended for public usage and will not be supported for public usage
- Intended for contributions by anyone not an employee of Oreum Industries; unsolicited contributions will not be accepted
- Project began on 2021-01-01
- The README.md is MacOS and POSIX oriented
- See LICENCE.md for licensing and copyright details
- See pyproject.toml for various package details
- This uses a logger named 'oreum_core', feel free to incorporate or ignore
- Hosting:
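As a minimal sketch of incorporating the 'oreum_core' logger into a host project, the snippet below uses only the standard library `logging` module. The function names `configure_oreum_logging` and `quieten_oreum_logging` and the log format are hypothetical choices for illustration; only the logger name comes from this README.

```python
import logging

LOGGER_NAME = "oreum_core"  # logger name as stated in this README


def configure_oreum_logging(level=logging.INFO):
    """Attach a single stream handler to the package logger and set its level."""
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def quieten_oreum_logging():
    """Ignore the package's routine messages by raising the threshold to WARNING."""
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(logging.WARNING)
    return logger
```

Calling `configure_oreum_logging(logging.DEBUG)` early in a notebook or script would surface the package's debug messages, while `quieten_oreum_logging()` keeps only warnings and above.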
For local development on MacOS
- Install Homebrew, see instructions at https://brew.sh
- Install direnv, git, git-lfs, graphviz, zsh
$> brew update && brew upgrade
$> brew install direnv git git-lfs graphviz zsh
Assumes direnv, git, git-lfs and zsh installed as above
$> git clone https://github.com/oreum-industries/oreum_core
$> cd oreum_core
Then allow direnv on MacOS to autorun file .envrc upon directory open
Notes:
- We use conda virtual envs controlled by mamba (quicker than conda)
- We install packages using miniforge (sourced from the conda-forge repo) wherever possible and only use pip for packages that are handled better by pip and/or more up-to-date on pypi
- Packages might not be the very latest because we want stability for pymc which is usually in a state of development flux
- See cheat sheet of conda commands
- The Makefile creates a dev env and will also download and preinstall miniforge if not yet installed on your system
From the dir above the oreum_core/ project dir:
$> make -C oreum_core/ dev
This will also create some files to help confirm / diagnose successful installation:
- dev/install_log/blas_info.txt for the BLAS MKL installation for numpy
- dev/install_log/pipdeptree[_rev].txt lists installed package deps (and reversed)
- LICENSES_THIRD_PARTY.md details the license for each package used
From the dir above the oreum_core/ project dir:
$> make -C oreum_core/ test-dev-env
This will also add files dev/install_log/[numpy|scipy].txt which detail successful installation (or not) for numpy, scipy
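To confirm quickly that the install log files were written, a small check along these lines may help. It is a sketch under assumptions: the helper `check_install_logs` is hypothetical, the default path assumes you run it from the dir above the oreum_core/ project dir, and the file names are read off the bracket patterns in this README (pipdeptree[_rev].txt taken to mean pipdeptree.txt and pipdeptree_rev.txt).

```python
from pathlib import Path

# Assumption: file names expanded from the patterns mentioned in this README
EXPECTED_LOGS = [
    "blas_info.txt",
    "pipdeptree.txt",
    "pipdeptree_rev.txt",
    "numpy.txt",
    "scipy.txt",
]


def check_install_logs(log_dir="oreum_core/dev/install_log", expected=EXPECTED_LOGS):
    """Return {filename: bool}; True means the file exists and is non-empty."""
    log_dir = Path(log_dir)
    result = {}
    for name in expected:
        path = log_dir / name
        result[name] = path.is_file() and path.stat().st_size > 0
    return result


if __name__ == "__main__":
    for name, ok in check_install_logs().items():
        print(f"{name}: {'found' if ok else 'missing or empty'}")
```

This only reports presence; the files themselves still need to be read to see whether numpy and scipy passed their checks.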
From the dir above the oreum_core/ project dir:
$> make -C oreum_core/ uninstall-env
We use pre-commit to run a suite of automated tests for code linting & quality control and repo control prior to commit on local development machines.
- Precommit is already installed by the make dev command (which itself calls pip install -e .[dev])
- The pre-commit script will then run on your system upon git commit
- See this project's .pre-commit-config.yaml for details
We use GitHub Actions aka GitHub Workflows to run:
- A suite of automated tests for commits received at the origin (i.e. GitHub)
- Publishing to PyPI upon creating a GH Release
- See Makefile for the CLI commands that are issued
- See .github/workflows/* for workflow details
Copyright 2024 Oreum OÜ t/a Oreum Industries. All rights reserved. See LICENSE.md.
Oreum OÜ t/a Oreum Industries, Sepapaja 6, Tallinn, 15551, Estonia, reg.16122291, oreum.io