parquet-csv

Parquet from and to CSV format converter


Keywords
parquet, csv, convert, pyarrow, arrow
License
MIT
Install
pip install parquet-csv==0.0.1

Documentation

Parquet_CSV


A converter between Parquet and CSV, built on Apache Arrow for speed and memory efficiency.

How to install

pip install parquet_csv

Use pip3 if both Python 2 and Python 3 are installed. This application only works with Python 3.

How to use

Converting Parquet

parquet_to_csv converts Parquet files to CSV files. By default it prints to standard output; to write to a file, redirect or pipe the output, or pass the -o flag.

Usage: parquet_to_csv.py [OPTIONS] INPUT_FILE

Options:
  -o, --output-path FILE  [default: (standard output)]
  --header / --no-header
  --verbose BOOLEAN
  --help                  Show this message and exit.

Selecting columns, gzip-ing output

Following the UNIX philosophy, use xsv to select columns from the CSV or to apply other transformations: pipe the output to xsv and you're all set.

Similarly, if you want the output compressed, pipe the result to gzip and redirect it to a local file ending in .csv.gz.
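
If xsv or gzip is not available, the same two steps can be approximated with the Python standard library. The sketch below is an assumption-laden illustration, not part of this package: select_and_gzip is a hypothetical helper that reads CSV text (for example, the converter's output piped into sys.stdin), keeps only the named columns, and writes a gzip-compressed file.

```python
import csv
import gzip


def select_and_gzip(csv_lines, columns, output_path):
    """Keep only `columns` from CSV text and write the result gzip-compressed."""
    reader = csv.DictReader(csv_lines)
    # Text mode lets csv write str; newline="" leaves csv's line endings untouched.
    with gzip.open(output_path, "wt", newline="", encoding="utf-8") as out:
        # extrasaction="ignore" drops every column not listed in `columns`.
        writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

For example, select_and_gzip(sys.stdin, ["id", "name"], "out.csv.gz") would consume piped CSV and keep two columns; the column names here are placeholders.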