A Python library that simplifies common machine learning tasks such as data preprocessing, dependency measures, and building neural networks.


Keywords: ML, data, analysis
License: MIT
Install: pip install wuml==0.145

Documentation

wuml

Chieh's quick ML library

Pip Installation

pip install wuml

Example Usages

Manipulation of wData type

Manipulation of wData type

Data Statistics

Learn about missing data stats
Feature-wise Correlation Matrices
Feature-wise HSIC Matrices

Measures

Norm

Dependency Measures

Comparing HSIC to Correlation
Approximate HSIC with fewer samples
Calculate Precision or Recall between labels
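
To illustrate why HSIC is listed next to correlation, here is a minimal standard-library sketch. It does not use wuml's API; the function names (pearson, gaussian_gram, center, hsic), the Gaussian kernel, and the bandwidth sigma=1.0 are all assumptions made for this illustration. It uses the biased empirical estimator trace(K H L H) / (n - 1)^2 on data where y depends on x only through x squared, so the linear correlation vanishes while HSIC does not.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def gaussian_gram(v, sigma=1.0):
    """Gram matrix with entries exp(-(v_i - v_j)^2 / (2 sigma^2))."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for b in v] for a in v]

def center(K):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row = [sum(r) / n for r in K]
    total = sum(row) / n
    return [[K[i][j] - row[i] - row[j] + total for j in range(n)] for i in range(n)]

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2 (hypothetical helper)."""
    n = len(x)
    K = gaussian_gram(x, sigma)
    Lc = center(gaussian_gram(y, sigma))
    # trace(K H L H) equals the elementwise sum of K * (H L H) for symmetric matrices
    return sum(K[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2

# y is fully determined by x, but the relationship is not linear
x = [i / 10 for i in range(-10, 11)]
y = [a ** 2 for a in x]

r = pearson(x, y)                       # close to zero: no linear relationship
h = hsic(x, y)                          # clearly positive: dependence detected
h_const = hsic(x, [1.0] * len(x))       # zero: a constant carries no information
print(r, h, h_const)
```

The kernel bandwidth strongly affects the HSIC value in practice, so the fixed sigma here is only a placeholder.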

IO

jupyter_print
Easily Create/Print a Table

Data Preprocessing

Obtain sample weight based on label likelihood
Show histogram of a feature
Basic split of data into training and test sets, optionally with a validation set
Split data into training and test sets + look at the histogram of their labels
Split data into training and test sets + run a basic neural network

Map data to the range between 0 and 1
Normalize each row to unit l1 or l2 norm
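
The two scaling operations above can be sketched in plain Python as follows. This is not wuml's implementation; min_max_scale and normalize_rows are hypothetical names, and it is an assumption that the mapping to [0, 1] is done per feature (column) via min-max scaling.

```python
import math

def min_max_scale(rows):
    """Rescale each column (feature) linearly so its values lie in [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # A constant column has hi == lo; map it to 0.0 to avoid dividing by zero.
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
            for row in rows]

def normalize_rows(rows, norm="l2"):
    """Scale each row so its l1 or l2 norm equals 1 (all-zero rows are left unchanged)."""
    out = []
    for row in rows:
        if norm == "l1":
            size = sum(abs(v) for v in row)
        elif norm == "l2":
            size = math.sqrt(sum(v * v for v in row))
        else:
            raise ValueError("norm must be 'l1' or 'l2'")
        out.append([v / size for v in row] if size > 0 else list(row))
    return out

X = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
scaled = min_max_scale(X)           # every column now spans [0, 1]
unit_l1 = normalize_rows(X, "l1")   # absolute values in each row sum to 1
unit_l2 = normalize_rows(X, "l2")   # each row has Euclidean length 1
print(scaled)
```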

Load data + decimate rows and columns with too many missing entries + auto-imputation
Load data + center/scale or map to between 0 and 1
With 10-fold cross validation
Get data subset with N samples from each Class

Build Neural Networks via PyTorch

Simple Regression with/without Batch Normalization + saving the network
Loading a saved and trained network for usage
Weighted Regression
Using HSIC as an objective with batch samples
Simple Classification
Basic Autoencoder Classification
Basic Autoencoder Regression
Complex mixture of Networks/Objectives

Distance Between Distributions

Wasserstein Distance Example
MMD Distance Example

Distribution Modeling

KDE Example
Maximum Likelihood on Exponential Distribution Example
Using Flow-based Deep Generative Model
Using Flow to get P(X)

Feature Selection

Unsupervised Filtering via HSIC

Explaining Models

Run basic SHAP/LIME explainer (Regression/Classification)
Run basic SHAP/LIME explainer on basic network
Run basic SHAP/LIME explainer on autoencoder network
Run basic SHAP/LIME explainer on complex network
Load a network that was saved together with its explainer

Regression / Classification

Run Several Basic Regressors
Interpret feature importance for linear Regressors
Run Several Basic Classifiers
Use bagging with 10-fold classifiers

Dimension Reduction

Run Several Dimension Reduction Examples

Clustering

Run Several Clustering Examples

Math Operations

Eigendecomposition
Integrate a univariate function
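
As a rough idea of what integrating a univariate function involves, here is a small sketch using composite Simpson's rule. It is a generic numerical method, not necessarily the one wuml uses, and the name integrate and its parameters are assumptions for this illustration.

```python
import math

def integrate(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with composite Simpson's rule."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # interior points alternate between weight 4 (odd index) and 2 (even index)
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

area = integrate(math.sin, 0.0, math.pi)            # the exact integral is 2
third = integrate(lambda t: t ** 2, 0.0, 1.0)       # the exact integral is 1/3
print(area, third)
```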

Feature Map Approximation

RFF and SORF

Rebalance Skewed Classification Data

Rebalance skewed data with oversampling and SMOTE

Repeated Runs of an Algorithm

Run simple k-fold cross validation
Run complex 10-fold cross validation
Repeat Experiments on Different Settings
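
For readers unfamiliar with k-fold cross validation, the index bookkeeping behind it can be sketched as below. This is a standard-library illustration only; k_fold_indices is a hypothetical helper and does not reflect how wuml organizes its folds.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample lands in exactly one test fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]     # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    # a model would be fit on `train` and evaluated on `test` here
    print(len(train), len(test))
```

Averaging the evaluation score over all folds then gives the cross-validated estimate.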