Hive import statement generator for Parquet datasets


License
MPL-2.0
Install
pip install parquet2hive==0.3.0

parquet2hive

Hive import statement generator for Parquet datasets. Supports versioned datasets and schema evolution.

Installing from PyPI

To install this package from PyPI, run:

pip install parquet2hive

Updating the Package on PyPI

To upload the most recent version, run:

python setup.py sdist upload

Using the TestPyPI Servers

You will need a separate account on https://testpypi.python.org. To upload the package to the TestPyPI servers, make sure your ~/.pypirc contains the following:

[distutils]
index-servers=
    pypi
    pypitest

[pypitest]
repository = https://testpypi.python.org/pypi
username = testpypi_username 
password = testpypi_password 

[pypi]
repository = https://pypi.python.org/pypi
username = pypi_username 
password = pypi_password   
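
Before uploading, it can help to confirm that the file parses and that every listed index server has its own section. The following is a minimal sketch using Python's standard configparser module. The helper name servers_missing_settings and the choice of required keys (repository, username, password) are assumptions based on the configuration shown above, not part of parquet2hive.

```python
import configparser

# Sample contents mirroring the ~/.pypirc shown above.
SAMPLE_PYPIRC = """\
[distutils]
index-servers=
    pypi
    pypitest

[pypitest]
repository = https://testpypi.python.org/pypi
username = testpypi_username
password = testpypi_password

[pypi]
repository = https://pypi.python.org/pypi
username = pypi_username
password = pypi_password
"""


def servers_missing_settings(pypirc_text):
    """Return the index servers that lack a complete section (hypothetical helper)."""
    # RawConfigParser avoids interpolation, so '%' in a password is harmless.
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    # The index-servers value spans several indented lines; split() separates the names.
    servers = parser.get("distutils", "index-servers").split()
    required = ("repository", "username", "password")
    return [
        name
        for name in servers
        if not all(parser.has_option(name, key) for key in required)
    ]
```

An empty list means every server named under index-servers has a section with all three keys. To check a real file, pass the text of ~/.pypirc instead of the sample.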

Upload the code using:

python setup.py sdist upload -r https://testpypi.python.org/pypi

Finally, install the most recent package from the test repository on any machine using:

pip install parquet2hive -i https://testpypi.python.org/pypi

Example usage

parquet2hive s3://telemetry-parquet/longitudinal | bash

To see the supported command-line arguments, run parquet2hive -h
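
The example above pipes the generated statements straight into bash, so it may be worth capturing and reading them first. The sketch below assumes parquet2hive is on the PATH and writes its statements to standard output, as the pipe in the example suggests. The helper names generate_statements and review are hypothetical.

```python
import subprocess
import sys


def generate_statements(command):
    """Run `command` and return its standard output as text."""
    # check=True raises CalledProcessError if the command exits with an error.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def review(command, stream=sys.stdout):
    """Write the generated statements to `stream` instead of executing them."""
    statements = generate_statements(command)
    stream.write(statements)
    return statements


# Example (requires parquet2hive and access to the bucket):
# review(["parquet2hive", "s3://telemetry-parquet/longitudinal"])
```

Once the output looks right, run the original pipeline to apply it.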