matrixTests

Fast Statistical Hypothesis Tests on Rows and Columns of Matrices


Keywords: anova, fast, hypothesis-testing, matrix, package, r, rows, t-test, wilcoxon-test
License: GPL-2.0

Matrix Tests

A package dedicated to running multiple statistical hypothesis tests on rows and columns of matrices.

Goals

  1. Fast execution via vectorization.
  2. Convenient and detailed output format.
  3. Compatibility with tests implemented in base R.
  4. Careful handling of missing values and edge cases.

Examples

1. Bartlett's test on columns

Bartlett's test on every column of the iris dataset, using Species as the grouping variable:

col_bartlett(iris[,-5], iris$Species)
             obs.tot obs.groups var.pooled df statistic                pvalue
Sepal.Length     150          3 0.26500816  2 16.005702 0.0003345076070163084
Sepal.Width      150          3 0.11538776  2  2.091075 0.3515028004158132768
Petal.Length     150          3 0.18518776  2 55.422503 0.0000000000009229038
Petal.Width      150          3 0.04188163  2 39.213114 0.0000000030547839322
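For comparison, the same p-values can be obtained in base R by calling bartlett.test() once per column. This is only a sketch of what the single col_bartlett() call replaces; the variable name pvals is chosen here for illustration. Given the package's stated goal of compatibility with base R, the values should agree with the pvalue column above.

```r
# Base R equivalent: one bartlett.test() call per measurement column of iris.
# sapply() iterates over the four numeric columns and keeps only the p-value.
pvals <- sapply(iris[, -5], function(column) {
  bartlett.test(column, iris$Species)$p.value
})

# Named numeric vector, one entry per column
print(pvals)
```

The loop is fine for four columns, but it repeats the per-call overhead of bartlett.test() for every column, which is the cost a matrix-level function avoids.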

2. Welch t-test on rows

Welch t-test performed on each row of two large matrices (one million rows each):

X <- matrix(rnorm(10000000), ncol = 10)
Y <- matrix(rnorm(10000000), ncol = 10)

row_t_welch(X, Y)  # running time: 2.4 seconds

Confidence interval computation can be turned off for a further increase in speed:

row_t_welch(X, Y, conf.level = NA)  # running time: 1 second
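To illustrate what vectorization means for this test, here is a minimal base R sketch that computes the Welch statistic, degrees of freedom and p-value for all rows at once, then checks the first row against t.test(). It is not the package's actual implementation; the helper row_vars and the smaller matrix size are assumptions made for the example.

```r
set.seed(1)
X <- matrix(rnorm(1000), ncol = 10)  # 100 rows, 10 observations per row
Y <- matrix(rnorm(1000), ncol = 10)

# Unbiased variance of every row, computed without an explicit loop.
# (hypothetical helper, not part of matrixTests)
row_vars <- function(m) rowSums((m - rowMeans(m))^2) / (ncol(m) - 1)

nx <- ncol(X)
ny <- ncol(Y)
mx <- rowMeans(X)
my <- rowMeans(Y)
vx <- row_vars(X)
vy <- row_vars(Y)

# Squared standard error of the difference in means, one value per row
se2 <- vx / nx + vy / ny
tstat <- (mx - my) / sqrt(se2)

# Welch-Satterthwaite degrees of freedom
df <- se2^2 / ((vx / nx)^2 / (nx - 1) + (vy / ny)^2 / (ny - 1))

# Two-sided p-value
pval <- 2 * pt(-abs(tstat), df)

# Compare the first row with base R's Welch t-test (the default of t.test)
ref <- t.test(X[1, ], Y[1, ])
print(all.equal(unname(ref$statistic), tstat[1]))
print(all.equal(ref$p.value, pval[1]))
```

Every step operates on whole vectors of row summaries, so the cost of interpreting R code is paid once rather than once per row, which is where most of the speed-up over a loop of t.test() calls comes from.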

Available Tests

| Variant                     | Name                            | Function                   |
|-----------------------------|---------------------------------|----------------------------|
| Location tests (1 group)    | Single sample Student's t.test  | row_t_onesample(x)         |
|                             | Single sample Wilcoxon's test   | row_wilcoxon_onesample(x)  |
| Location tests (2 groups)   | Equal variance Student's t.test | row_t_equalvar(x, y)       |
|                             | Welch adjusted Student's t.test | row_t_welch(x, y)          |
|                             | Two sample Wilcoxon's test      | row_wilcoxon_twosample(x, y) |
| Location tests (paired)     | Paired Student's t.test         | row_t_paired(x, y)         |
|                             | Paired Wilcoxon's test          | row_wilcoxon_paired(x, y)  |
| Location tests (2+ groups)  | Equal variance oneway anova     | row_oneway_equalvar(x, g)  |
|                             | Welch's oneway anova            | row_oneway_welch(x, g)     |
|                             | Kruskal-Wallis test             | row_kruskalwallis(x, g)    |
|                             | van der Waerden's test          | row_waerden(x, g)          |
| Scale tests (2 groups)      | F variance test                 | row_f_var(x, y)            |
| Scale tests (2+ groups)     | Bartlett's test                 | row_bartlett(x, g)         |
|                             | Fligner-Killeen test            | row_flignerkilleen(x, g)   |
|                             | Levene's test                   | row_levene(x, g)           |
|                             | Brown-Forsythe test             | row_brownforsythe(x, g)    |
| Association tests           | Pearson's correlation test      | row_cor_pearson(x, y)      |
| Periodicity tests           | Cosinor                         | row_cosinor(x, t, period)  |
| Distribution tests          | Jarque-Bera test                | row_jarquebera(x)          |
|                             | Anderson-Darling test           | row_andersondarling(x)     |

Further Information

For more information, please refer to the following Wiki pages:

  1. Installation Instructions
  2. Design Decisions
  3. Speed Benchmarks
  4. Bug Fixes and Improvements to Base R

See Also

Literature

Computing thousands of test statistics simultaneously in R, Holger Schwender, Tina Müller.
Statistical Computing & Graphics. Volume 18, No 1, June 2007.

Packages

CRAN:

  1. ttests() in the Rfast package.
  2. row.ttest.stat() in the metaMA package.
  3. MultiTtest() in the ClassComparison package.
  4. bartlettTests() in the heplots package.
  5. harmonic.regression() in the HarmonicRegression package.

Bioconductor:

  1. lmFit() in the limma package.
  2. rowttests() in the genefilter package.
  3. mt.teststat() in the multtest package.
  4. row.T.test() in the HybridMTest package.
  5. rowTtest() in the viper package.
  6. lmPerGene() in the GSEAlm package.

GitHub:

  1. rowWilcoxonTests() in the sanssouci package.
  2. matrix.t.test() in the pi0 package.
  3. wilcoxauc() in the presto package.