convert any python function to unix-style command

pip install cbox==0.5.0


CBOX - CLI ToolBox


The Unix Philosophy (from wikipedia):

  • Write programs that do one thing and do it well.
  • Write programs to work together.
  • Write programs to handle text streams, because that is a universal interface.


Features

  • supports pipes
  • concurrency (threading or asyncio)
  • supports error handling (redirected to stderr)
  • supports inline code in cli style
  • various output processing options (filtering, early stopping..)
  • supports multiple types of pipe processing (lines, chars..)
  • automatic docstring parsing for description and arguments help
  • automatic type annotation and defaults parsing
  • returns the correct exitcode based on errors
  • supports only python3 (yes this is a feature)
  • supports subcommands



install:

pip install -U cbox

example usage:

#!/usr/bin/env python3
# hello.py
import cbox

@cbox.cmd
def hello(name: str):
    """greets a person by its name.

    :param name: the name of the person
    """
    print(f'hello {name}!')

if __name__ == '__main__':
    cbox.main(hello)

run it:

$ ./hello.py --name world
hello world!

$ ./hello.py --help
usage: hello.py [-h] --name NAME

greets a person by its name.

optional arguments:
  -h, --help   show this help message and exit
  --name NAME  the name of the person
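
to make the mapping from a function signature to command line options more concrete, here is a minimal sketch that uses only the standard library (inspect and argparse). it is not cbox's implementation: build_parser is a hypothetical helper, the extra times parameter is added only for illustration, and per-argument help from the docstring is left out.

```python
import argparse
import inspect


def build_parser(func):
    """Hypothetical helper: derive an argparse parser from a function signature."""
    doc = inspect.getdoc(func) or ''
    # use the first docstring line as the command description
    description = doc.splitlines()[0] if doc else None
    parser = argparse.ArgumentParser(prog=func.__name__, description=description)
    for name, param in inspect.signature(func).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        has_annotation = param.annotation is not inspect.Parameter.empty
        parser.add_argument(
            f'--{name}',
            # fall back to str when the parameter has no type annotation
            type=param.annotation if has_annotation else str,
            # parameters without a default become required options
            required=not has_default,
            default=param.default if has_default else None,
        )
    return parser


def hello(name: str, times: int = 1):
    """greets a person by its name."""
    return '\n'.join(f'hello {name}!' for _ in range(times))


parser = build_parser(hello)
args = parser.parse_args(['--name', 'world', '--times', '2'])
result = hello(**vars(args))
print(result)
```

the annotation becomes the argparse type, so --times 2 arrives as the int 2, and a parameter without a default turns into a required option, which matches the usage line shown above.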

cli inline example:

$ echo -e "\n192.168.2.3\ngoogle.com" | cbox --modules re 're.findall("(?:\d+\.)+\d+", s)'

for more info about cbox inline run cbox --help

The Story

once upon a time, a python programmer named dave had a simple text file, langs.txt:


python http://python.org
lisp http://lisp-lang.org
ruby http://ruby-lang.org

all dave wanted was to get the list of languages from that file.

our dave heard that unix commands are the best, so he started googling them out.

he started reading about awk, grep, sed, tr, cut and others but couldn't remember how to use all of them - after all he is a python programmer and wants to use python.

fortunately, our little dave found out about cbox - a simple way to convert any python function into unix-style command line!

now dave can process files using python easily!

simple example

#!/usr/bin/env python3
# first.py
import cbox

@cbox.stream()
def first(line):
    return line.split()[0]

if __name__ == '__main__':
    cbox.main(first)

running it:

$ cat langs.txt | ./first.py 

or inline cli style:

$ cat langs.txt | cbox 's.split()[0]'

note: s is the input variable
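
conceptually, the inline style applies one python expression to every input line, with s bound to the current line. a rough pure-python equivalent could look like the sketch below. it is an illustration only, not how cbox is implemented, and apply_inline is a made-up name.

```python
def apply_inline(expression, lines):
    """Evaluate `expression` once per line, with `s` bound to that line."""
    # compile once so the expression is not re-parsed for every line
    code = compile(expression, '<inline>', 'eval')
    for line in lines:
        yield eval(code, {}, {'s': line})


langs = [
    'python http://python.org',
    'lisp http://lisp-lang.org',
    'ruby http://ruby-lang.org',
]
names = list(apply_inline('s.split()[0]', langs))
print(names)  # ['python', 'lisp', 'ruby']
```

eval runs arbitrary code, so an approach like this is only suitable for expressions you typed yourself.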

now dave is satisfied, so like every satisfied programmer - he wants more!

dave now wants to get a list of the languages' urls.

arguments and help message

#!/usr/bin/env python3
# nth-item.py
import cbox

# we can pass default values and use type annotations for correct types
@cbox.stream()
def nth_item(line, n: int = 0):
    """returns the nth item from each line.

    :param n: the number of item position starting from 0
    """
    return line.split()[n]

if __name__ == '__main__':
    cbox.main(nth_item)

running it:

$ ./nth-item.py --help
usage: nth-item.py [-h] [-n N]

returns the nth item from each line.

optional arguments:
  -h, --help  show this help message and exit
  -n N        the number of item position starting from 0
$ cat langs.txt | ./nth-item.py 
$ cat langs.txt | ./nth-item.py -n 1

now dave wants to get the status out of each url, for this we can use requests.

but processing a large list one url at a time would take too long, so he is better off using threads.

threading example

#!/usr/bin/env python3
# url-status.py
import cbox
import requests

@cbox.stream(worker_type='thread', max_workers=4)
def url_status(line):
    resp = requests.get(line)
    return f'{line} - {resp.status_code}'

if __name__ == '__main__':
    cbox.main(url_status)

running it:

$ cat langs.txt | ./nth-item.py -n 1 | ./url-status.py
http://python.org - 200
http://lisp-lang.org - 200
http://ruby-lang.org - 200

or inline cli style

$ cat langs.txt | cbox 's.split()[1]' | cbox -m requests  -w thread -c 4 'f"{s} - {requests.get(s).status_code}"'
http://python.org - 200
http://lisp-lang.org - 200
http://ruby-lang.org - 200
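
for readers curious what a thread worker pool of size 4 amounts to, the sketch below uses concurrent.futures.ThreadPoolExecutor from the standard library. it is an analogy rather than cbox's internals, and fake_status is a stand-in that sleeps instead of making a network request, so the example runs offline.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_status(url):
    """Stand-in for requests.get(url).status_code; sleeps to simulate I/O."""
    time.sleep(0.1)
    return f'{url} - 200'


urls = [f'http://example.com/{i}' for i in range(8)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in input order even though the calls overlap in time
    results = list(pool.map(fake_status, urls))
elapsed = time.perf_counter() - start

print(results[0])
print(f'{elapsed:.2f}s for {len(urls)} urls')
```

with 4 workers the 8 simulated requests run in roughly two batches instead of eight sequential waits, which is the reason threads pay off for I/O-bound work like http requests.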

Advanced Usage

Error handling

#!/usr/bin/env python3
# numbersonly.py
import cbox

@cbox.stream()
def numbersonly(line):
    """returns the lines containing only numbers. bad lines reported to stderr.
    if any bad line is detected, exits with exitcode 2.
    """
    if not line.isnumeric():
        raise ValueError('{} is not a number'.format(line))
    return line

if __name__ == '__main__':
    cbox.main(numbersonly)

all errors are redirected to stderr:

$ echo -e "123\nabc\n567" | ./numbersonly.py
Traceback (most recent call last):
  File "/home/shmulik/cs/cbox/cbox/concurrency.py", line 54, in _simple_runner
    yield func(item, **kwargs), None
  File "numbersonly.py", line 11, in numbersonly
    raise ValueError('{} is not a number'.format(line))
ValueError: abc is not a number


we can ignore the stderr stream by redirecting it to /dev/null:

$ echo -e "123\nabc\n567" | ./numbersonly.py 2>/dev/null

our command returns 2 as the exit code, indicating an error. we can get the exit code of the last command by running echo $?:

$ echo $?
2
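
the behaviour described above (good lines on stdout, tracebacks on stderr, exit code 2 if anything failed) can be pictured with the following sketch. run_stream is a hypothetical runner written for this illustration, not cbox's code, and continuing after a bad line is an assumption based on the docstring of numbersonly.

```python
import io
import sys
import traceback


def run_stream(func, lines, out=sys.stdout, err=sys.stderr):
    """Hypothetical runner: apply func per line, report failures, return an exit code."""
    exit_code = 0
    for line in lines:
        try:
            result = func(line)
        except Exception:
            # report the failure on the error stream and keep going (assumption)
            traceback.print_exc(file=err)
            exit_code = 2
            continue
        print(result, file=out)
    return exit_code


def numbersonly(line):
    if not line.isnumeric():
        raise ValueError('{} is not a number'.format(line))
    return line


# capture both streams in memory so the split between them is visible
out, err = io.StringIO(), io.StringIO()
code = run_stream(numbersonly, ['123', 'abc', '567'], out=out, err=err)
print(out.getvalue(), end='')
print('exit code:', code)
```

a real script would pass the returned code to sys.exit so the shell can see it through $?.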


Filtering

cbox.stream supports three types of return values: str, None and an iterable of strs.

None is skipped and outputs nothing, a str is output normally, and each item in an iterable is output as a separate str.

here is a simple example:

#!/usr/bin/env python3
# extract-domains.py
import re
import cbox

@cbox.stream()
def extract_domains(line):
    """tries to extract all the domains from the input using simple regex"""
    return re.findall(r'(?:\w+\.)+\w+', line) or None  # or None can be omitted

if __name__ == '__main__':
    cbox.main(extract_domains)

we can now run it (notice that we can have multiple domains or zero domains on each line):

$ echo -e "google.com cbox.com\nhello\nfacebook.com" | ./extract-domains.py 
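
the handling of the three return types can be sketched in plain python as follows. normalize is a hypothetical helper invented for this illustration (it is not part of cbox); it shows how None, a str and an iterable of strs could each be turned into zero, one or many output lines.

```python
import re


def normalize(result):
    """Turn a function's return value into a list of output lines."""
    if result is None:
        return []          # skipped: nothing is written
    if isinstance(result, str):
        return [result]    # a single output line
    return [str(item) for item in result]  # one output line per item


def extract_domains(line):
    return re.findall(r'(?:\w+\.)+\w+', line) or None


lines = ['google.com cbox.com', 'hello', 'facebook.com']
output = [out for line in lines for out in normalize(extract_domains(line))]
print(output)  # ['google.com', 'cbox.com', 'facebook.com']
```

the str check has to come before the generic iterable branch, because a str is itself iterable and would otherwise be split into single characters.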

Early Stopping

cbox.stream supports early stopping, i.e. stopping before reading the whole stdin

example implementing a simple head command

#!/usr/bin/env python3
# head.py
import cbox

counter = 0

@cbox.stream()
def head(line, n: int):
    """returns the first `n` lines"""
    global counter
    counter += 1

    if counter > n:
        raise cbox.Stop()  # can also raise StopIteration()
    return line

if __name__ == '__main__':
    cbox.main(head)

getting the first 2 lines:

$ echo -e "1\n2\n3\n4" | ./head.py -n 2
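
to illustrate what stopping before reading the whole input means, here is a small sketch with a lazy input source. the Stop class, run_until_stop and make_head are all defined here for the example and are not cbox's own code; the point is only that input after the stop signal is never consumed.

```python
class Stop(Exception):
    """Stand-in early-stop signal, defined here for the sketch."""


def run_until_stop(func, lines):
    """Yield results until func raises Stop; the remaining input is never read."""
    for line in lines:
        try:
            yield func(line)
        except Stop:
            return


def make_head(n):
    """Build a head-like function that keeps its counter in a closure."""
    seen = 0

    def head(line):
        nonlocal seen
        seen += 1
        if seen > n:
            raise Stop()
        return line

    return head


consumed = []


def source():
    # lazy input: records which items were actually pulled by the runner
    for item in ['1', '2', '3', '4']:
        consumed.append(item)
        yield item


first_two = list(run_until_stop(make_head(2), source()))
print(first_two)  # ['1', '2']
print(consumed)   # ['1', '2', '3']
```

the third item is read because the function only learns it is past the limit when it sees that item, but the fourth is never pulled from the source.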


Concurrency

cbox supports simple (default), asyncio and thread workers. we can use asyncio like this:

#!/usr/bin/env python3
# tcping.py
import asyncio
import cbox

@cbox.stream(worker_type='asyncio', workers_window=30)
async def tcping(domain, timeout: int=3):
    # open_connection no longer accepts a `loop` argument (removed in python 3.10)
    fut = asyncio.open_connection(domain, 80)
    try:
        reader, writer = await asyncio.wait_for(fut, timeout=timeout)
        writer.close()  # the connection succeeded, release the socket
        status = 'up'
    except (OSError, asyncio.TimeoutError):
        status = 'down'

    return '{} is {}'.format(domain, status)

if __name__ == '__main__':
    cbox.main(tcping)

this will try to open up to 30 connections in parallel using asyncio.

running it:

$ echo -e "\n192.168.2.3\ngoogle.com" | ./tcping.py
is down
is down
google.com is up
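
one common way to cap the number of coroutines in flight is an asyncio.Semaphore. the sketch below shows that general technique. whether workers_window works this way internally is an assumption, and bounded_map and fake_ping are hypothetical names used only here (fake_ping sleeps instead of opening a connection so the example runs offline).

```python
import asyncio


async def bounded_map(func, items, limit):
    """Run func over items with at most `limit` calls in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        # wait for a free slot before starting the call
        async with semaphore:
            return await func(item)

    # gather returns results in the order of its arguments
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_ping(domain):
    await asyncio.sleep(0.01)  # stand-in for a real connection attempt
    return f'{domain} is up'


results = asyncio.run(bounded_map(fake_ping, ['a.test', 'b.test', 'c.test'], limit=2))
print(results)  # ['a.test is up', 'b.test is up', 'c.test is up']
```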

more examples can be found in the examples/ dir


Contributing

cbox is open source software intended for everyone. please feel free to create PRs, add examples to the examples/ dir, request features and ask questions.

Creating Local Dev Env

after cloning the repo, you'll need to install test dependencies from test-requirements.txt.

there is a simple make command to install them (you'll need miniconda installed):

$ make test-setup

or you can use pip install -r test-requirements.txt (preferably in new virtualenv).

now ensure all tests run and pass locally:

$ make test