sparsebn

Learning Sparse Bayesian Networks from High-Dimensional Data



Project Status: Active. The project has reached a stable, usable state and is being actively developed.

Introducing sparsebn: A new R package for learning sparse Bayesian networks and other graphical models from high-dimensional data via sparse regularization. Designed from the ground up to handle:

  • Experimental data with interventions
  • Mixed observational / experimental data
  • High-dimensional data with p >> n
  • Datasets with thousands of variables (tested up to p=8000)
  • Continuous and discrete data

The emphasis of this package is scalability and statistical consistency on high-dimensional datasets. sparsebn is designed to scale to larger problems than existing algorithms (it has been tested with up to p=8000 variables) and is under active development. For more details on this package, including worked examples and the methodological background, please see our preprint [1].
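
As a rough illustration of the experimental-data support listed above, the sketch below wraps the flow cytometry dataset bundled with the package in a sparsebnData object and passes it to estimate.dag (introduced in the Overview). It assumes the dataset is exposed as cytometryContinuous with data and ivn components and that sparsebnData accepts type and ivn arguments; please check ?sparsebnData for your installed version.

```r
library(sparsebn)

# Bundled flow cytometry data (assumed components: $data and $ivn)
data(cytometryContinuous)

# ivn is a list with one entry per row of the data, recording which
# variables were under intervention for that observation.
dat <- sparsebnData(cytometryContinuous$data,
                    type = "continuous",
                    ivn  = cytometryContinuous$ivn)

# Estimate a solution path of DAGs, one per regularization parameter
dags <- estimate.dag(dat)
print(dags)
```

For purely observational data the ivn argument can be left out, in which case the data are treated as observational.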

Overview

The main methods for learning graphical models are:

  • estimate.dag for directed acyclic graphs (Bayesian networks).
  • estimate.precision for undirected graphs (Markov random fields).
  • estimate.covariance for covariance matrices.

Currently, estimation of precision and covariance matrices is limited to Gaussian data.
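
The three estimators above can be called in the same way. The following is a minimal sketch, assuming the simulated Gaussian dataset shipped with the package is available as pathfinder with a data component; all tuning arguments are left at their defaults.

```r
library(sparsebn)

# Simulated Gaussian data bundled with the package (assumed: pathfinder$data)
data(pathfinder)
dat <- sparsebnData(pathfinder$data, type = "continuous")

# Each call returns a solution path: one estimate per value of the
# regularization parameter lambda.
dags  <- estimate.dag(dat)         # directed acyclic graphs
precs <- estimate.precision(dat)   # precision (inverse covariance) matrices
covs  <- estimate.covariance(dat)  # covariance matrices

length(dags)     # number of estimates in the DAG solution path
dim(covs[[1]])   # p x p, where p = ncol(pathfinder$data)
```

Because the result is a whole path rather than a single estimate, a model selection step is usually needed afterwards to pick one value of lambda.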

The workhorse behind sparsebn is the sparsebnUtils package, which provides various S3 classes and methods for representing and manipulating graphs. The basic algorithms are implemented in ccdrAlgorithm (continuous data) and discretecdAlgorithm (discrete data).
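
To give a feel for the S3 objects involved, here is a hedged sketch that fits a solution path and then inspects it with helper functions we understand to be provided by sparsebnUtils (num.edges, select.parameter, select, get.adjacency.matrix). The exact names and arguments are assumptions based on the package documentation and may differ between versions.

```r
library(sparsebn)

data(cytometryContinuous)
dat  <- sparsebnData(cytometryContinuous$data,
                     type = "continuous",
                     ivn  = cytometryContinuous$ivn)
dags <- estimate.dag(dat)

class(dags)       # the solution path object
num.edges(dags)   # edge count of each estimate along the path

# Pick one estimate using the package's parameter-selection heuristic,
# then extract its adjacency matrix.
idx <- select.parameter(x = dags, data = dat)
fit <- select(dags, index = idx)
adj <- get.adjacency.matrix(fit)
dim(adj)          # one row and one column per variable
```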

Installation

You can install:

  • the latest CRAN version with

    install.packages("sparsebn")
  • the latest development version from GitHub with

    devtools::install_github(c("itsrainingdata/sparsebn/", "itsrainingdata/sparsebnUtils/dev", "itsrainingdata/ccdrAlgorithm/dev", "gujyjean/discretecdAlgorithm"))
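
After installing, a quick way to confirm that the four packages mentioned in this README are available is sketched below. It uses only base R; the variable names pkgs and installed are ours and purely illustrative.

```r
# The four packages named in this README
pkgs <- c("sparsebn", "sparsebnUtils", "ccdrAlgorithm", "discretecdAlgorithm")

# TRUE for each package that can be loaded from the current library paths
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report versions for whatever is present
for (p in pkgs[installed]) {
  cat(p, as.character(utils::packageVersion(p)), "\n")
}
```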

References

[1] Aragam, B., Gu, J., and Zhou, Q. (2017). Learning large-scale Bayesian networks with the sparsebn package. arXiv: 1703.04025.

[2] Aragam, B. and Zhou, Q. (2015). Concave penalized estimation of sparse Gaussian Bayesian networks. The Journal of Machine Learning Research, 16(Nov): 2273-2328.

[3] Fu, F., Gu, J., and Zhou, Q. (2014). Adaptive penalized estimation of directed acyclic graphs from categorical data. arXiv: 1403.2310.

[4] Aragam, B., Amini, A. A., and Zhou, Q. (2015). Learning directed acyclic graphs with penalized neighbourhood regression. arXiv: 1511.08963.

[5] Fu, F. and Zhou, Q. (2013). Learning sparse causal Gaussian networks with experimental intervention: Regularization and coordinate descent. Journal of the American Statistical Association, 108: 288-300.