# PySummary
pysummary.py is a Python script for computing summary statistics. As a command, pysummary.py reads from standard input, uses a space as the field delimiter by default, and writes to standard output. pysummary.py can also be used as a Python module.
Install as a command
- download/clone the repository
- Unix/Linux
make install
pysummary.py will be installed to /usr/local/bin
- Windows users: please read the Tips section first
Install from Pip
pip install pysummary
Usage
pysummary.py [options]
Options
Option | Description |
---|---|
-f# | field/column index (start from 1) |
-d# | delimiter |
-s# | skip first # lines |
-p# | set print precision |
-c# | set confidence |
-i# | NA value to ignore |
-h | print help |
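To make the field, delimiter, skip, and NA options concrete, here is a minimal sketch of how one column could be pulled out of delimited text. The helper `extract_field` and its parameter names are hypothetical and written only for illustration; the actual parsing inside pysummary.py may differ. The sample rows are the ones used in the CSV example of this README.

```python
import io


def extract_field(stream, field=1, delimiter=" ", skip=0, ignore=None):
    """Yield one 1-based column of a delimited text stream as floats.

    Hypothetical helper mirroring the -f, -d, -s and -i options;
    it is not part of pysummary itself.
    """
    for line_no, line in enumerate(stream):
        if line_no < skip:
            continue  # like -s#: skip the first `skip` lines
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.split(delimiter)  # like -d#
        value = parts[field - 1].strip()  # like -f#: index starts from 1
        if ignore is not None and value == ignore:
            continue  # like -i#: drop the NA marker
        yield float(value)


data = io.StringIO('"c1","c2","c3","c4"\n1,2,3,4\n5,6,7,8\n9,0,1,2\n3,4,5,6\n')
values = list(extract_field(data, field=2, delimiter=",", skip=1))
print(values)  # [2.0, 6.0, 0.0, 4.0]
```

Note that this sketch splits on the exact delimiter string, so runs of repeated spaces would produce empty fields; a real implementation would likely need to handle that case.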
Examples
- Use case: basic summary of a single-column text file
- Unix/Linux, PowerShell on Windows:
cat data.txt | pysummary.py
- Unix/Linux, cmd.exe on Windows:
pysummary.py < data.txt
- Output:
_______Field = 1
_______Lines = 26
________Mean = 4.80769
____Variance = 4.77071
______StdDev = 2.18420
_________Sum = 125.00000
_________Min = 0.00000
_________Max = 9.00000
______Median = 5.00000
__Confidence = 0.95000
___Cnf.Itv.L = 3.90801
___Cnf.Itv.U = 5.70738
- Use case: summarize the 2nd field of a comma-separated values (CSV) file, skipping the first line (header)
- Input:
"c1","c2","c3","c4"
1,2,3,4
5,6,7,8
9,0,1,2
3,4,5,6
- Command:
cat data.csv | pysummary.py -d',' -f2 -s1
- Output:
_______Field = 2
_______Lines = 4
________Mean = 3.00000
____Variance = 5.00000
______StdDev = 2.23607
_________Sum = 12.00000
_________Min = 0.00000
_________Max = 6.00000
______Median = 3.00000
__Confidence = 0.95000
___Cnf.Itv.L = -1.10852
___Cnf.Itv.U = 7.10852
- Use pysummary as a module
from pysummary import stats

with open('data.txt') as f:
    res = stats(stream=f, field=1, delimiter=' ', skip=0, confidence=0.95)
    print(res)
    # individual values are also available, e.g. print(res.mean, res.variance)
- Output:
1 26 4.80769 4.77071 2.18420 125.00000 0.00000 9.00000 5.00000 0.95000 3.90801 5.70738
- Supported properties (in printing order):
field
lines
mean
variance
std_dev
sum
min
max
median
confidence
low_limit
high_limit
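The numbers in the CSV example can be checked by hand. The sketch below uses only Python's standard library and is not pysummary's own code. It is consistent with the printed output under two assumptions inferred from those numbers: the variance and standard deviation are population values (divide by n), and the confidence interval is a Student's t interval built from the standard error of the mean with n - 1 degrees of freedom. The critical value is hard-coded from a standard t table rather than computed.

```python
import math
import statistics

values = [2.0, 6.0, 0.0, 4.0]  # 2nd column of the CSV example
n = len(values)

mean = statistics.mean(values)
variance = statistics.pvariance(values)  # population variance (divide by n)
std_dev = statistics.pstdev(values)      # population standard deviation
median = statistics.median(values)

# Assumption: two-sided 95% Student's t critical value for n - 1 = 3
# degrees of freedom, copied from a standard t table.
T_CRIT_95_DF3 = 3.182446

# Standard error of the mean uses the sample standard deviation (n - 1).
std_err = statistics.stdev(values) / math.sqrt(n)
low_limit = mean - T_CRIT_95_DF3 * std_err
high_limit = mean + T_CRIT_95_DF3 * std_err

print("%.5f %.5f %.5f %.5f %.5f %.5f"
      % (mean, variance, std_dev, median, low_limit, high_limit))
# 3.00000 5.00000 2.23607 3.00000 -1.10852 7.10852
```

With only four values the interval is wide and its lower limit is negative even though every observation is non-negative, which is expected for a t interval on such a small sample.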
Tips
- Make pysummary.py a command on Windows
- create a bin folder under the current user's home folder (PowerShell):
mkdir ~/bin
- copy pysummary.py to that folder (PowerShell):
cp pysummary.py ~/bin
- add C:/Users/YOURNAME/bin to the PATH variable
- add .py to the PATHEXT variable to make Python scripts executable
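After editing the environment variables, it can be handy to confirm that they took effect. The following is a small sketch, not part of pysummary: the helper `has_entry` is a hypothetical name, and the semicolon separator is the one Windows uses for PATH and PATHEXT.

```python
import os


def has_entry(variable_value, entry, sep=";"):
    """Return True if `entry` is one of the `sep`-separated items.

    Hypothetical helper: the comparison ignores case and trailing
    backslashes, and treats forward slashes and backslashes alike.
    """
    def norm(item):
        return item.strip().replace("/", "\\").rstrip("\\").lower()

    items = [norm(item) for item in variable_value.split(sep) if item.strip()]
    return norm(entry) in items


if __name__ == "__main__":
    bin_dir = os.path.join(os.path.expanduser("~"), "bin")
    print("bin on PATH:   ", has_entry(os.environ.get("PATH", ""), bin_dir))
    print(".py in PATHEXT:", has_entry(os.environ.get("PATHEXT", ""), ".py"))
```

Run it from a newly opened terminal, since changes to environment variables are usually not picked up by windows that were already open.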
Dependencies
- NumPy
- SciPy
Contact
- Zhonghua Xi xizhonghua@gmail.com