subsample

Subsample data from a uniform distribution so two different statuses for the same entity have the same number of samples.


Keywords
gpl, library, program, Propose Tags
License
CNRI-Python-GPL-Compatible
Install
cabal install subsample-0.1.0.0

Documentation

subsample

subsample, Gregory W. Schwartz. Subsample data from a uniform distribution so
all statuses for all entities have the same number of samples.

Usage: subsample [--sampleCol TEXT] [--statusCol TEXT] [--n INT]

Available options:
  -h,--help                Show this help text
  --sampleCol TEXT         ([name] | COLUMN) The column containing the names of
                           the samples.
  --statusCol TEXT         ([status] | COLUMN) The column containing the
                           statuses of the entities.
  --n INT                  ([MINIMUM] | INT) The number of samples to pull from
                           each status. The default value is the sample number
                           of the status with the smallest number of samples.