_________________________
< OarphPy!! Oarph! Oarph! >
< OarphKit for Python!! >
-------------------------
\
\
____ __ ___ -~~~~-
/ __ \___ ________ / / / _ \__ __|O __ O|
/ /_/ / _ `/ __/ _ \/ _ \/ ___/ // /|_\__/_|__-
\____/\_,_/_/ / .__/_//_/_/ \_,---(__/\__)---
.--/_/ /___/ / ~--~ \
,__;` o __`'. _,..-/ | \/ | \
' `'---' `'.'. .'.'` | | /\ | |
.'-...-`.' _/ /\__ __/\ \_
-...-` ~~~~~ ~~~~ ~~~~~
OarphPy is a collection of Python utilities for Data Science with PySpark and Tensorflow. It is related (but orthogonal) to OarphKit.
Quickstart
Install from PyPI:
$ pip install oarphpy
We test OarphPy in a variety of environments (see below), so it should play well with your Jupyter/Colab notebook or project environment. To include all extras, use:
$ pip install oarphpy[all]
Or use the dockerized environment hosted on DockerHub:
$ ./oarphcli --shell
-- or --
$ docker run -it --net=host oarphpy/full bash
See also API documentation.
Demos
Dockerized Development Environments
OarphPy is built and tested in a variety of environments to ensure the library works with and without optional dependencies. These environments are shared on DockerHub and defined in the docker subdirectory of this repo:
- oarphpy/full -- Includes Tensorflow, Jupyter, a binary install of Spark, and other tools like Bokeh. Use this environment for ad hoc data science or as a starter for other projects.
- oarphpy/base-py2 -- Tests oarphpy in a vanilla Python 2.7 environment to ensure clean interop with other projects.
- oarphpy/base-py3 -- Tests oarphpy in a vanilla Python 3 environment to ensure clean interop with other projects.
- oarphpy/spark -- Tests oarphpy with a vanilla install of PySpark to ensure basic compatibility.
- oarphpy/tensorflow -- Tests oarphpy with Tensorflow 1.x to ensure basic compatibility (e.g. of oarphpy.util.tfutil).
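Because these environments differ mainly in which optional dependencies are installed, it can help to check what is importable in your own environment before choosing one. The following is a minimal sketch using only the Python 3 standard library; the helper name find_available_modules and the list of module names checked are illustrative assumptions, not part of the OarphPy API.

```python
import importlib.util


def find_available_modules(module_names):
    """Map each top-level module name to True if it is importable here.

    Hypothetical helper for illustration; not part of OarphPy.
    """
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    # The module names below are assumptions about which optional
    # dependencies you care about; adjust them for your project.
    modules = ["oarphpy", "pyspark", "tensorflow"]
    for name, available in find_available_modules(modules).items():
        status = "available" if available else "missing"
        print(f"{name}: {status}")
```

A module reported as missing here suggests either installing the corresponding extra or using one of the Docker images above that bundles it.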
Development
See ./oarphcli --help for the development and release workflow.