dnaseq_validation

Validate a dnaseq workflow run by comparing its FastQC, Picard, and samtools metrics against expected values.


License

Apache-2.0

Install

pip install dnaseq_validation==0.5

Documentation

Example Run:

$ cwltool dnaseq_validation.cwl --input_json_expected_metrics /mnt/SCRATCH/expected_NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.json --input_sqlite_metrics /mnt/SCRATCH/123e4567-e89b-12d3-a456-426655440000.db 
/home/ubuntu/.virtualenvs/cwl1/bin/cwltool 1.0.20170329142446
Resolved 'dnaseq_validation.cwl' to 'file:///mnt/SCRATCH/gdc-dnaseq-cwl/tools/dnaseq_validation.cwl'
[job dnaseq_validation.cwl] /tmp/tmprnYRy1$ docker \
    run \
    -i \
    --volume=/mnt/SCRATCH/123e4567-e89b-12d3-a456-426655440000.db:/var/lib/cwl/stge22a4110-744a-4e2c-982a-dd9f4744be49/123e4567-e89b-12d3-a456-426655440000.db:ro \
    --volume=/mnt/SCRATCH/expected_NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.json:/var/lib/cwl/stg796c3a99-3cf0-4b32-a9f9-91bfbcedeb35/expected_NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.json:ro \
    --volume=/tmp/tmprnYRy1:/var/spool/cwl:rw \
    --volume=/tmp/tmpH6jfLY:/tmp:rw \
    --workdir=/var/spool/cwl \
    --read-only=true \
    --user=1000 \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/var/spool/cwl \
    quay.io/ncigdc/dnaseq_validation \
    /usr/local/bin/dnaseq_validation \
    --input_json_expected_metrics \
    /var/lib/cwl/stg796c3a99-3cf0-4b32-a9f9-91bfbcedeb35/expected_NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.json \
    --input_sqlite_metrics \
    /var/lib/cwl/stge22a4110-744a-4e2c-982a-dd9f4744be49/123e4567-e89b-12d3-a456-426655440000.db
[job dnaseq_validation.cwl] completed success
{
    "log": {
        "checksum": "sha1$20da7709a0aabf646ce6dc57518e69f9c38c43db", 
        "basename": "test_results.log", 
        "location": "file:///mnt/SCRATCH/gdc-dnaseq-cwl/tools/test_results.log", 
        "path": "/mnt/SCRATCH/gdc-dnaseq-cwl/tools/test_results.log", 
        "class": "File", 
        "size": 2104
    }, 
    "results": {
        "checksum": "sha1$1c5d4960ac6c205f1648072bc59a604988c9d3d8", 
        "basename": "test_results.json", 
        "location": "file:///mnt/SCRATCH/gdc-dnaseq-cwl/tools/test_results.json", 
        "path": "/mnt/SCRATCH/gdc-dnaseq-cwl/tools/test_results.json", 
        "class": "File", 
        "size": 946
    }
}
Final process status is success

Example test_results.json output:

{
    "overall": true,
    "steps": {
        "average_quality_samtools_stats": true,
        "bases_duplicated_samtools_stats": true,
        "bases_mapped_samtools_stats": true,
        "count_fastq_files": true,
        "count_files_output": true,
        "count_readgroups": true,
        "pairs_diff_chr_samtools_stats": true,
        "pairs_other_orient_samtools_stats": true,
        "raw_total_seq_samtools_stats": true,
        "read_pair_dups_picard_markduplicates": true,
        "read_pairs_picard_markduplicates": true,
        "read_unmapped_samtools_stats": true,
        "reads_dup_samtools_stats": true,
        "reads_mapped_and_paired_samtools_stats": true,
        "reads_mapped_samtools_stats": true,
        "reads_mq0_samtools_stats": true,
        "reads_paired_samtool_stats": true,
        "reads_prop_paired_samtools_stats": true,
        "seqs_samtools_stats": true,
        "total_length_samtools_stats": true
    }
}
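
A downstream script may want to act on this file rather than read it by eye. The following is a minimal sketch, assuming the layout shown above (a boolean "overall" plus a "steps" mapping of check name to boolean). The function name failed_steps and the sample data are illustrative and are not part of dnaseq_validation.

```python
import json


def failed_steps(results):
    """Return the sorted names of checks whose value is not True."""
    steps = results.get("steps", {})
    return sorted(name for name, passed in steps.items() if passed is not True)


# Illustrative data in the same shape as test_results.json above;
# the false value is made up to show a failing check.
sample = json.loads("""
{
    "overall": false,
    "steps": {
        "count_fastq_files": true,
        "count_readgroups": false,
        "seqs_samtools_stats": true
    }
}
""")

print(failed_steps(sample))  # prints ['count_readgroups']
```

To apply this to a real run, the sample could be replaced with the result of json.load() on the test_results.json file written by the workflow.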

Example test_results.log output:

2017-04-14_16:15:43_UTC INFO average_quality_samtools_stats: expected value 34.3 matches test value 34.3
2017-04-14_16:15:43_UTC INFO bases_duplicated_samtools_stats: expected value 19719543 matches test value 19719543
2017-04-14_16:15:43_UTC INFO bases_mapped_samtools_stats: expected value 327208084 matches test value 327208084
2017-04-14_16:15:43_UTC INFO count_fastq_files: expected value 4 matches test value 4
2017-04-14_16:15:43_UTC INFO count_files_output: expected value 2 matches test value 2
2017-04-14_16:15:43_UTC INFO count_readgroups: expected value 1 matches test value 1
2017-04-14_16:15:43_UTC INFO pairs_diff_chr_samtools_stats: expected value 36383 matches test value 36383
2017-04-14_16:15:43_UTC INFO pairs_other_orient_samtools_stats: expected value 1810 matches test value 1810
2017-04-14_16:15:43_UTC INFO raw_total_seq_samtools_stats: expected value 3241645 matches test value 3241645
2017-04-14_16:15:43_UTC INFO read_pair_dups_picard_markduplicates: expected value 96481 matches test value 96481
2017-04-14_16:15:43_UTC INFO read_pairs_picard_markduplicates: expected value 1608166 matches test value 1608166
2017-04-14_16:15:43_UTC INFO read_unmapped_samtools_stats: expected value 1961 matches test value 1961
2017-04-14_16:15:43_UTC INFO reads_dup_samtools_stats: expected value 195243 matches test value 195243
2017-04-14_16:15:43_UTC INFO reads_mapped_and_paired_samtools_stats: expected value 3216332 matches test value 3216332
2017-04-14_16:15:43_UTC INFO reads_mapped_samtools_stats: expected value 3239684 matches test value 3239684
2017-04-14_16:15:43_UTC INFO reads_mq0_samtools_stats: expected value 51754 matches test value 51754
2017-04-14_16:15:43_UTC INFO reads_paired_samtool_stats: expected value 3220218 matches test value 3220218
2017-04-14_16:15:43_UTC INFO reads_prop_paired_samtools_stats: expected value 3096912 matches test value 3096912
2017-04-14_16:15:43_UTC INFO seqs_samtools_stats: expected value 3241645 matches test value 3241645
2017-04-14_16:15:43_UTC INFO total_length_samtools_stats: expected value 327406145 matches test value 327406145
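
The log lines above follow a regular shape: timestamp, level, check name, then the expected and test values. A minimal sketch of parsing that shape is given below. The regular expression is inferred only from the matching lines shown here, so the wording of a mismatch line is unknown, and any line that differs simply yields None. The names LOG_LINE and parse_log_line are hypothetical.

```python
import re

# Pattern inferred from the example lines above; not taken from the tool's source.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<check>\w+): "
    r"expected value (?P<expected>\S+) matches test value (?P<test>\S+)$"
)


def parse_log_line(line):
    """Return the named fields of a 'matches' log line, or None if the line differs."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


line = (
    "2017-04-14_16:15:43_UTC INFO count_fastq_files: "
    "expected value 4 matches test value 4"
)
print(parse_log_line(line))
```

Values are kept as strings because the log mixes integers and decimals (for example 34.3); converting them is left to the caller.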