unvcf

Split VCF file into intelligible tab-delimited files


Keywords
vcf, file, genotype, variant, call, converter
License
MIT
Install
pip install unvcf==0.1.2


unvcf


Split VCF file into plain tab-delimited files.

VCF stands for Variant Call Format, a file format widely used in bioinformatics for encoding genetic variants. Although it was inspired by CSV files to (I suppose) be easy for humans to read and easy for machines to parse, it nowadays does neither well. Its full specification cannot be given beforehand, as each VCF file is free to specify the form of its fields. As a result, the first step in processing the data in a VCF file is often to extract a subset of it and convert the extracted data into a more amenable file format.
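To illustrate why the layout of a VCF file is only known once its header has been read, here is a minimal sketch that lists the INFO and FORMAT fields a file declares for itself. It is not part of unvcf: the names `META_LINE`, `declared_fields` and `example_header` are hypothetical, and the regular expression assumes the common `ID, Number, Type` ordering of the meta lines.

```python
import re

# Matches meta lines such as:
# ##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
# Assumption: ID, Number and Type appear in this order.
META_LINE = re.compile(r'^##(INFO|FORMAT)=<ID=([^,]+),Number=([^,]+),Type=([^,>]+)')


def declared_fields(lines):
    """Collect (kind, id, number, type) tuples from VCF header lines."""
    fields = []
    for line in lines:
        if not line.startswith("##"):
            break  # meta lines end at the "#CHROM" column header
        match = META_LINE.match(line)
        if match:
            fields.append(match.groups())
    return fields


# Made-up header used only for illustration.
example_header = [
    '##fileformat=VCFv4.2',
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">',
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1',
]

for kind, field_id, number, value_type in declared_fields(example_header):
    print(kind, field_id, number, value_type)
```

Two files with different meta lines would yield different field lists, which is why a generic parser cannot know the columns in advance.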

This command-line Python package aims to tackle the above problems. It splits a VCF file into standard CSV files, each containing a different field of the original VCF file, thereby making the data easy to read for both humans and machines again.
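The general idea of one table per field can be sketched as follows. This is an illustration only and makes no claim about how unvcf is implemented or how its output files are laid out: the function names `split_format_fields` and `to_tsv`, the choice of `CHROM` and `POS` as row labels, and the example data are all assumptions.

```python
import csv
import io


def split_format_fields(vcf_lines):
    """Return {format_key: rows}, one table per per-sample FORMAT key.

    Each table starts with a header row (CHROM, POS, sample names) followed
    by one row per variant.
    """
    tables = {}
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta lines
        columns = line.split("\t")
        if line.startswith("#CHROM"):
            samples = columns[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos = columns[0], columns[1]
        keys = columns[8].split(":")
        per_sample = [value.split(":") for value in columns[9:]]
        for index, key in enumerate(keys):
            table = tables.setdefault(key, [["CHROM", "POS"] + samples])
            # Trailing sub-fields may be omitted for a sample; mark them missing.
            values = [s[index] if index < len(s) else "." for s in per_sample]
            table.append([chrom, pos] + values)
    return tables


def to_tsv(rows):
    """Serialise a table as tab-delimited text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up two-sample VCF used only for illustration.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "1\t100\t.\tA\tG\t50\tPASS\tDP=10\tGT:DP\t0/1:4\t1/1:6",
    "1\t200\t.\tC\tT\t60\tPASS\tDP=12\tGT:DP\t0/0:7\t0/1:5",
]

tables = split_format_fields(example_vcf)
print(to_tsv(tables["GT"]))
```

Each resulting table is a plain rectangle of values that any spreadsheet or data-frame library can load directly.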

Install

Assuming you have a working Python installation, you will almost certainly have all the requirements to install unvcf. From a terminal, enter:

pip install unvcf

Usage

After installation, unvcf is available from the terminal. You can start using it as follows:

unvcf path_to_file.vcf destination_folder/

The above command will produce CSV files that represent the original fields of path_to_file.vcf. For more information, enter:

unvcf --help

If by any chance you run into a problem or have a question, please create a new issue, and we will try to sort it out as soon as possible.

Authors

License

This project is licensed under the MIT License.