miracle-imputation

Missing data Imputation Refinement And Causal LEarning


License
BSD-3-Clause
Install
pip install miracle-imputation==0.1.6

Documentation

MIRACLE (Missing data Imputation Refinement And Causal LEarning)


Code Author: Trent Kyono

This repository contains the code for the paper "MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms" (NeurIPS 2021).

Installation

pip install -r requirements.txt
pip install .

Tests

You can run the tests with:

pip install -r requirements_dev.txt
pip install .
pytest -vsx

Contents

  • miracle/MIRACLE.py - Imputer/refiner class. It takes a baseline imputation and returns a refined imputation (see the usage sketch after this list). This code has been forked from [2].
  • miracle/third_party - Reference imputers: Mean, MissForest, MICE, GAIN, Sinkhorn, and KNN.
  • tests/run_example.py - Runs a nonlinear toy DAG example, using mean imputation as the baseline and applying MIRACLE to refine it.
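
The snippet below is a minimal sketch of the refinement workflow (mean imputation as the baseline, MIRACLE as the refiner). The import path, the num_inputs constructor argument, and the fit(X_missing, X_seed=...) signature are assumptions based on the description above; see miracle/MIRACLE.py and tests/run_example.py for the exact API.

# Minimal sketch of the refinement workflow. The import path, the num_inputs
# argument, and the fit(X_missing, X_seed=...) signature are assumptions;
# consult miracle/MIRACLE.py and tests/run_example.py for the exact API.
import numpy as np
from miracle import MIRACLE  # assumed import path

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))            # complete toy data
mask = rng.uniform(size=X.shape) < 0.3     # 30% missingness
X_missing = np.where(mask, np.nan, X)

# Baseline imputation: fill each column's missing entries with its mean.
col_means = np.nanmean(X_missing, axis=0)
X_seed = np.where(np.isnan(X_missing), col_means, X_missing)

# Refine the baseline imputation with MIRACLE.
miracle = MIRACLE(num_inputs=X.shape[1])
X_refined = miracle.fit(X_missing, X_seed=X_seed)

# RMSE on the missing entries, baseline vs. refined.
baseline_rmse = np.sqrt(np.mean((X_seed[mask] - X[mask]) ** 2))
miracle_rmse = np.sqrt(np.mean((X_refined[mask] - X[mask]) ** 2))
print(f"Baseline RMSE: {baseline_rmse:.3f}, MIRACLE RMSE: {miracle_rmse:.3f}")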

Examples

A basic example on the toy DAG:

$ cd tests/
$ python run_example.py

This specific instantiation returns a baseline RMSE of approximately 0.95 and a MIRACLE RMSE of approximately 0.40.

An example that runs the toy example with a dataset size of 2000, 300 max_steps, and 30% missingness:

$ python3 run_example.py --dataset_sz 2000 --max_steps 300 --missingness 0.3

Citing

@inproceedings{kyono2021miracle,
	title        = {MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms},
	author       = {Kyono, Trent and Zhang, Yao and Bellot, Alexis and van der Schaar, Mihaela},
	year         = 2021,
	booktitle    = {Conference on Neural Information Processing Systems (NeurIPS) 2021}
}

References

[1] Jinsung Yoon, James Jordon, and Mihaela van der Schaar. GAIN: Missing data imputation using generative adversarial nets. In ICML, 2018.

[2] Trent Kyono, Yao Zhang, and Mihaela van der Schaar. CASTLE: Regularization via auxiliary causal graph discovery. In NeurIPS, 2020.

[3] Xun Zheng, Bryon Aragam, Pradeep Ravikumar, and Eric P. Xing. DAGs with NO TEARS: Continuous optimization for structure learning. In NeurIPS, 2018.

[4] Xun Zheng, Chen Dan, Bryon Aragam, Pradeep Ravikumar, and Eric P. Xing. Learning sparse nonparametric DAGs. In AISTATS, 2020.