cwl-wrapper

CWL wrapper to add stage-in and stage-out to a base app package


License
Apache-2.0
Install
pip install cwl-wrapper==0.12.3


Getting Started & Usage

Installation

Via conda

conda install -c eoepca cwl-wrapper

Development

Clone this repository, then create the conda environment with:

cd cwl-wrapper
conda env create -f environment.yml
conda activate env_cwl_wrapper

Use setuptools to install the project:

python setup.py install

Check the installation with:

cwl-wrapper --help

Requirements

  • Python
  • console

Python requirements

  • jinja2
  • pyyaml
  • click
  • click-config-file

Configuration

The rules that establish the connections and conventions with the user CWL are defined in the cwl-wrapper configuration file.

Rules

rulez:
  version: 1

rulez -> version defines the rules version. Currently only version 1 is supported.

parser:
  driver: cwl

parser -> driver defines the type of objects to be parsed


onstage:
  driver: cwl

  stage_in:
    connection_node: node_stage_in
    if_scatter:
      scatterMethod: dotproduct

  on_stage:
    connection_node: on_stage

  stage_out:
    connection_node: node_stage_out

The onstage configuration is applied to the maincwl.yaml file.

onstage -> driver defines the driver to use during the translation. The result must be in CWL format.

onstage -> stage_in

onstage -> stage_in -> connection_node defines the anchor node name for stage-in start. If the node does not exist, the parser creates it.

onstage -> stage_in -> if_scatter defines the conditions for scatter methods

onstage -> stage_in -> if_scatter -> scatterMethod is the method to use for scatter feature

onstage -> on_stage

onstage -> on_stage -> connection_node defines the anchor node name for user node. If the node does not exist, the parser creates it.

onstage -> stage_out -> connection_node defines the anchor node name for stage-out start. If the node does not exist, the parser creates it.

The stage_in, stage_out and on_stage nodes can be customized by the user.

The Parser uses the node name as an anchor to start the phase.
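The anchor behaviour described above can be sketched in Python. This is only an illustration, not the cwl-wrapper implementation: the function name ensure_connection_nodes and the shape of the placeholder step are assumptions, while the rule keys and node names come from the configuration above.

```python
# Hypothetical sketch, not the actual cwl-wrapper parser.
RULES = {
    "onstage": {
        "driver": "cwl",
        "stage_in": {
            "connection_node": "node_stage_in",
            "if_scatter": {"scatterMethod": "dotproduct"},
        },
        "on_stage": {"connection_node": "on_stage"},
        "stage_out": {"connection_node": "node_stage_out"},
    }
}


def ensure_connection_nodes(workflow, rules):
    """Return the anchor step names, creating any step that is missing."""
    steps = workflow.setdefault("steps", {})
    anchors = []
    for phase in ("stage_in", "on_stage", "stage_out"):
        name = rules["onstage"][phase]["connection_node"]
        # Assumed placeholder shape for a step the parser has to create.
        steps.setdefault(name, {"in": {}, "out": [], "run": ""})
        anchors.append(name)
    return anchors


base_template = {"class": "Workflow", "id": "stage-manager", "inputs": [], "outputs": {}}
anchors = ensure_connection_nodes(base_template, RULES)
print(anchors)
```

Steps that already exist under an anchor name are left untouched, which is what lets a custom main workflow supply its own nodes.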

Base template example

class: Workflow
doc: Main stage manager
id: stage-manager
label: theStage
inputs: []
outputs: {}

requirements:
  SubworkflowFeatureRequirement: {}
  ScatterFeatureRequirement: {}

output:
  driver: cwl
  name: '-'
  type: $graph

output -> driver defines the output driver. Currently only the 'CWL' driver is defined.

output -> name is deprecated.

output -> type defines the type of output.

The CWL driver needs the following templates to define the types:

  GlobalInput:
    Directory: string
    Directory[]: string[]

GlobalInput defines the rules that replace user types with WPS types. For example, a user workflow input of type Directory becomes string.
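A minimal Python sketch of this type replacement is shown below. The helper name to_global_type is hypothetical, and passing unlisted types through unchanged is an assumption; only the mapping itself is taken from the GlobalInput configuration.

```python
# Mapping copied from the GlobalInput configuration above.
GLOBAL_INPUT = {"Directory": "string", "Directory[]": "string[]"}


def to_global_type(user_type, mapping=GLOBAL_INPUT):
    """Replace a user type with its mapped type.

    Assumption: types that are not listed pass through unchanged.
    """
    return mapping.get(user_type, user_type)


user_inputs = {"aoi": "string", "input_reference": "Directory[]"}
wrapped = {name: to_global_type(t) for name, t in user_inputs.items()}
print(wrapped)  # {'aoi': 'string', 'input_reference': 'string[]'}
```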


  stage_in:
    Directory:
      type: string
      inputBinding:
        position: 2

    Directory[]:
      type: string[]
      inputBinding:
        position: 2

The stage_in entries are the templates used to link the user inputs.


  stage_out:
    Directory:
      type: Directory
      inputBinding:
        position: 6

    Directory[]:
      type: Directory[]
      inputBinding:
        position: 6

The stage_out entries define the stage-out template, which depends on the user output type.
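To make the lookup concrete, here is a small Python sketch that selects a template by phase and user type. The function template_for and its error handling are assumptions made for illustration; the template values are copied from the configuration above.

```python
import copy

# Templates copied from the configuration above.
TEMPLATES = {
    "stage_in": {
        "Directory": {"type": "string", "inputBinding": {"position": 2}},
        "Directory[]": {"type": "string[]", "inputBinding": {"position": 2}},
    },
    "stage_out": {
        "Directory": {"type": "Directory", "inputBinding": {"position": 6}},
        "Directory[]": {"type": "Directory[]", "inputBinding": {"position": 6}},
    },
}


def template_for(phase, user_type, templates=TEMPLATES):
    """Return a fresh copy of the template for a phase and a user type."""
    try:
        # Deep copy so callers can edit the result without touching TEMPLATES.
        return copy.deepcopy(templates[phase][user_type])
    except KeyError:
        raise ValueError(f"no {phase} template for type {user_type!r}") from None


stage_in_input = template_for("stage_in", "Directory[]")
stage_out_input = template_for("stage_out", "Directory")
print(stage_in_input["type"], stage_out_input["type"])  # string[] Directory
```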

Usage

The cwl-wrapper requires

  • user CWL

Examples

In this section we will study how to create and change cwl-wrapper templates:

  • src/cwl_wrapper/assets/stagein.yaml
  • src/cwl_wrapper/assets/stageout.yaml
  • src/cwl_wrapper/assets/maincwl.yaml

Default run

$graph:
  - baseCommand: vegetation-index
    class: CommandLineTool
    hints:
      DockerRequirement:
        dockerPull: eoepca/vegetation-index:0.2
    id: clt
    inputs:
      inp1:
        inputBinding:
          position: 1
          prefix: --input_reference
        type: Directory
      inp2:
        inputBinding:
          position: 2
          prefix: --aoi
        type: string
    outputs:
      results:
        outputBinding:
          glob: .
        type: Directory
    requirements:
      EnvVarRequirement:
        envDef:
          PATH: /opt/anaconda/envs/env_vi/bin:/opt/anaconda/envs/env_vi/bin:/home/fbrito/.nvm/versions/node/v10.21.0/bin:/opt/anaconda/envs/notebook/bin:/opt/anaconda/bin:/usr/share/java/maven/bin:/opt/anaconda/bin:/opt/anaconda/envs/notebook/bin:/opt/anaconda/bin:/usr/share/java/maven/bin:/opt/anaconda/bin:/opt/anaconda/condabin:/opt/anaconda/envs/notebook/bin:/opt/anaconda/bin:/usr/lib64/qt-3.3/bin:/usr/share/java/maven/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/fbrito/.local/bin:/home/fbrito/bin:/home/fbrito/.local/bin:/home/fbrito/bin
          PREFIX: /opt/anaconda/envs/env_vi
      ResourceRequirement: {}
    stderr: std.err
    stdout: std.out
  - class: Workflow
    doc: Vegetation index processor, the greatest
    id: vegetation-index
    inputs:
      aoi:
        doc: Area of interest in WKT
        label: Area of interest
        type: string
      input_reference:
        doc: EO product for vegetation index
        label: EO product for vegetation index
        type: Directory[]
      input_reference2:
        doc: EO product for vegetation index
        label: EO product for vegetation index
        type: Directory[]
    label: Vegetation index
    outputs:
      - id: wf_outputs
        outputSource:
          - node_1/results
        type:
          items: Directory
          type: array
    requirements:
      - class: ScatterFeatureRequirement
    steps:
      node_1:
        in:
          inp1: input_reference
          inp2: aoi
        out:
          - results
        run: '#clt'
        scatter: inp1
        scatterMethod: dotproduct
cwlVersion: v1.0

Run

python cwl-wrapper assets/vegetation.cwl  --output  assets/vegetation.wf.yaml

The expected result is the file vegetation.wf.yaml.

The following elements have been added to the new file:

New Stage-in

In the new stage-in we are going to add two new parameters:

  • parameter_A
  • parameter_B

baseCommand: stage-in
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: eoepca/stage-in:0.2
id: stagein
arguments:
  - prefix: -t
    position: 1
    valueFrom: "./"

inputs:
    parameter_A:
      doc: EO product for vegetation index
      label: EO product for vegetation index
      type: string[]
    parameter_B:
      doc: EO product for vegetation index
      label: EO product for vegetation index
      type: string[]
outputs: {}
requirements:
  EnvVarRequirement:
    envDef:
      PATH: /opt/anaconda/envs/env_stagein/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
  ResourceRequirement: {}

The inputs can be written in either dict or list format.
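The two formats carry the same information: in list format each entry holds its name under the id key, while in dict format the name is the key. The Python sketch below converts the list format to the dict format. The helper inputs_as_dict is hypothetical and is not part of cwl-wrapper.

```python
def inputs_as_dict(inputs):
    """Normalize CWL inputs given as a list of {id: ...} entries to dict form."""
    if isinstance(inputs, dict):
        return dict(inputs)
    normalized = {}
    for entry in inputs:
        fields = dict(entry)  # copy so the caller's entry is not modified
        name = fields.pop("id")
        normalized[name] = fields
    return normalized


dict_form = {
    "parameter_A": {"type": "string[]"},
    "parameter_B": {"type": "string[]"},
}
list_form = [
    {"id": "parameter_A", "type": "string[]"},
    {"id": "parameter_B", "type": "string[]"},
]
print(inputs_as_dict(list_form) == dict_form)  # True
```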

In the new run, we have to update the parameter stagein:

python cwl-wrapper assets/vegetation.cwl  --stagein assets/stagein-test.cwl --output  vegetation.wf_new_stagein.yaml

The following have been added to the new output file vegetation.wf_new_stagein.yaml:

New Stage-out

The stage-out template follows the same rules as the stage-in template; we only need to change the run parameters.

python cwl-wrapper assets/vegetation.cwl  --stageout assets/stagein-test.cwl --output  vegetation.wf_new_stageout.yaml
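When these runs are scripted, the argument list can be assembled in Python. The sketch below only builds the command and does not execute it. The function build_command is hypothetical, and it assumes the cwl-wrapper console command used in the installation check rather than the python cwl-wrapper form shown in the runs; the --stagein, --stageout and --output options are the ones used in this document.

```python
import shlex


def build_command(user_cwl, output, stagein=None, stageout=None):
    """Build the argument list for a cwl-wrapper run."""
    # Assumption: the console entry point is named "cwl-wrapper".
    argv = ["cwl-wrapper", user_cwl]
    if stagein:
        argv += ["--stagein", stagein]
    if stageout:
        argv += ["--stageout", stageout]
    argv += ["--output", output]
    return argv


argv = build_command(
    "assets/vegetation.cwl",
    "vegetation.wf_new_stagein.yaml",
    stagein="assets/stagein-test.cwl",
)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) to execute the run.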

The maincwl.yaml is the workflow into which cwl-wrapper pastes all the user templates, creating a new CWL workflow.

maincwl.yaml works together with the rules file, where the connection rules are defined.

The rules file defines three entry points (the connection_node values of stage_in, on_stage and stage_out), which are either created in or linked to the new workflow.

Now we can try to change the maincwl.yaml by adding a new custom step before the stage-in:

class: Workflow
doc: Main stage manager
id: stage-manager
label: theStage
inputs:
  myinputs:
      doc: myinputs doc
      label: myinputs label
      type: string
outputs: {}
requirements:
  SubworkflowFeatureRequirement: {}
  ScatterFeatureRequirement: {}
steps:
    custom_node:
      in:
        myinputs: myinputs
      out:
      - example_out
      run:
        class: CommandLineTool
        baseCommand: do_something
        inputs:
          myinputs:
            type: string
            inputBinding:
              prefix: --file
        outputs:
          example_out:
            type: File
            outputBinding:
              glob: hello.txt
    node_stage_in:
      in:
        custom_input: custom_node/example_out
      out: []
      run: ''

Run

python cwl-wrapper assets/vegetation.cwl  --maincwl  ../assets/custom_main.cwl --output  vegetation.wf.custom_maincwl.yaml

The output file is vegetation.wf.custom_maincwl.yaml.
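Before running the wrapper with a custom main workflow, it can help to check that every step input points at a workflow input or at an output of another step, as custom_input: custom_node/example_out does above. The Python sketch below performs that check on a reduced copy of the custom workflow. The function check_step_sources is hypothetical, and it assumes that inputs and steps are written in dict format.

```python
def check_step_sources(workflow):
    """Return (step, source) pairs that match no workflow input or step output."""
    steps = workflow.get("steps", {})
    known = set(workflow.get("inputs", {}))
    for name, step in steps.items():
        known.update(f"{name}/{out}" for out in step.get("out", []))
    missing = []
    for name, step in steps.items():
        for source in step.get("in", {}).values():
            if source not in known:
                missing.append((name, source))
    return missing


# Reduced copy of the custom main workflow shown above.
custom_main = {
    "inputs": {"myinputs": {"type": "string"}},
    "steps": {
        "custom_node": {"in": {"myinputs": "myinputs"}, "out": ["example_out"]},
        "node_stage_in": {"in": {"custom_input": "custom_node/example_out"}, "out": []},
    },
}
print(check_step_sources(custom_main))  # []
```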

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the Apache-2.0 License. See LICENSE for more information.

Contact

Terradue - @terradue - info@terradue.com

Project Link: https://github.com/EOEPCA/proc-ades

Acknowledgements
