pip install python-awk==0.0.10


pawk is a Python-based replacement for awk.

It uses Python for line-by-line processing of files.


#pawk automatically reads lines as csv rows and stores the result as a list in "r"
#-g ("grep") keeps a subset of lines satisfying a given condition

#Selects lines from input.txt with at least 3 csv fields
> pawk -f input.txt -g 'len(r) > 2'

#Keep a subset of lines where the second csv field is non-empty
> pawk -f input.txt -g 'r[1]'

#The above may crash if some lines have fewer than two csv fields
#Use this instead:
> pawk -f input.txt -g 'len(r) > 1 and r[1]'

#The raw line is stored in the "l" variable
#Keep a subset of lines where l isn't empty and the first character is "a"
> pawk -f input.txt -g 'l != "" and l[0] == "a"'
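
For readers who prefer to see the idea in plain Python, the -g examples above behave roughly like the filter below. This is a minimal sketch and not pawk's actual source: the helper name grep_lines and the lambda-based condition are hypothetical, and only the variable names "l" and "r" come from the text above. It assumes Python 3.

```python
import csv

def grep_lines(text, condition):
    """Yield each raw line for which condition(l, r) is truthy.

    l is the raw line and r is that line parsed as a csv row,
    mirroring the variable names used in the examples above.
    """
    for l in text.splitlines():
        # Parse this single line as one csv row; an empty line gives [].
        r = next(csv.reader([l]), [])
        if condition(l, r):
            yield l

sample = "a,1,x\nb,\nc\nalpha,2,y,z"

# Same test as: -g 'len(r) > 1 and r[1]'
kept = list(grep_lines(sample, lambda l, r: len(r) > 1 and r[1]))
print(kept)
```

Checking len(r) first means the short-circuiting "and" never touches r[1] on a line that has too few fields.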

#Run certain code for each input line using -p
#Using -p prevents the default printing of the line

#For each line of the input, print that line with whitespace stripped
> pawk -f input.txt -p 'print(l.strip())'

#default value of -f is /dev/stdin
> less input.txt | pawk -p 'print(len(r))'

#-d sets the input delimiter
#the output delimiter is ",", so this command converts a tsv to a csv
> pawk -f input.txt -d '\t'
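
The tsv-to-csv conversion above can be approximated with Python's csv module. This is only a sketch under assumptions: the function name convert_delimiter is hypothetical, and the quoting of fields that contain a comma is the csv module's default behaviour, which pawk may or may not share.

```python
import csv
import io

def convert_delimiter(text, in_delim="\t", out_delim=","):
    """Re-emit delimited text with a different delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=in_delim)
    out = io.StringIO()
    # lineterminator="\n" avoids the csv module's default "\r\n".
    writer = csv.writer(out, delimiter=out_delim, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

tsv = "name\tnote\nann\thello, world\n"
print(convert_delimiter(tsv), end="")
```

A field that already contains a comma is quoted on output here, so the converted file still parses back into the same columns.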

#pawk stores the line number (zero-indexed) in the "i" variable
#only keep lines starting with the 1133rd
> pawk -f input.txt -g 'i>=1132'

#replace matches of a regular expression in each line (python re module imported by default)
> pawk -f input.txt -p 'print(re.sub("U_C_Rate","firearm_rate",l))'

#-b runs code before any lines are processed
#-e runs code after all lines are processed
#To add up a list of floats
> pawk -f input.txt -b "cnt=0" -p "cnt += float(l)" -e "print(cnt)"
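
The -b / -p / -e flow above amounts to running three snippets against one shared set of variables. The sketch below illustrates that idea in Python 3; the helper name run and its keyword arguments are hypothetical and this is not how pawk is necessarily implemented.

```python
def run(lines, begin="", process="", end=""):
    """Hypothetical helper: exec `begin` once, `process` per line, `end` once.

    All three snippets share the namespace `env`, so a variable set in
    `begin` is visible while processing lines and in `end`.
    """
    env = {}
    exec(begin, env)
    for i, l in enumerate(lines):
        env["i"] = i  # zero-indexed line number
        env["l"] = l  # raw line
        exec(process, env)
    exec(end, env)
    return env

env = run(["1.5", "2", "0.5"],
          begin="cnt = 0",
          process="cnt += float(l)",
          end="total = cnt")
print(env["total"])  # 4.0
```

Because the namespace is shared, the running total set up by the "before" snippet survives across every line and is still available to the "after" snippet.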

Writing multi-line Python in pawk:
pawk can process one-line strings that represent multi-line Python. The syntax is heavily inspired by a source I can't find right now.

#(semi-colon) or (colon+whitespace) causes a line break
'import random; print(random.random())'
import random;
print(random.random())

#after lines with (colon+whitespace) successive lines are automatically indented:
'if i>3: print("hello world!"); a += 1; b = 0'
if i>3:
   print("hello world!");
   a += 1;
   b = 0

#use the 'end;' keyword to force indent level to decrease (compare this example with the above)
'if i>3: print("hello world!"); end; a += 1; b = 0'
if i>3:
   print("hello world!");
a += 1;
b = 0

#"elif:", "else:" and "except:" automatically cause indenting to decrease
'if i>3: print("a"); elif i>1: print("b"); else: print("c")'
if i>3:
   print("a");
elif i>1:
   print("b");
else:
   print("c")

#you can define functions!
'def test123(): print("hello world!"); end; test123(); test123(); test123();'
def test123():
    print("hello world!");
test123();
test123();
test123();
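
The expansion rules in this section can be sketched as a small Python 3 function. This is a naive illustration and not pawk's actual parser: the name expand is hypothetical, it drops the trailing semicolons instead of keeping them, and it will mis-split a ";" or a colon followed by whitespace that appears inside a string literal.

```python
import re

# Keywords that step the indent back one level before they are emitted.
DEDENT_KEYWORDS = re.compile(r"(elif|else|except)\b")

def expand(one_liner, indent="    "):
    """Expand a one-line snippet into multi-line Python (naive sketch)."""
    # Break at every ";" and after every ":" that is followed by whitespace.
    pieces = re.split(r";|(?<=:)\s+", one_liner)
    level = 0
    lines = []
    for piece in pieces:
        stmt = piece.strip()
        if not stmt:
            continue
        if stmt == "end":
            # 'end;' only lowers the indent level; it emits nothing.
            level = max(level - 1, 0)
            continue
        if DEDENT_KEYWORDS.match(stmt):
            level = max(level - 1, 0)
        lines.append(indent * level + stmt)
        if stmt.endswith(":"):
            # A block opener indents the statements that follow it.
            level += 1
    return "\n".join(lines)

print(expand('if i>3: print("a"); elif i>1: print("b"); else: print("c")'))
# if i>3:
#     print("a")
# elif i>1:
#     print("b")
# else:
#     print("c")
```

The expanded text is ordinary Python source, so it can be passed to exec() or compile() like any other code string.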