2.4. Inputs

2.4.1. Essential Input Parameters

The inputs of a tool are a list of input parameters that control how the tool runs. Each parameter has an id, which names the parameter, and a type, which describes what values are valid for that parameter.

Available primitive types are string, boolean, int, long, float, double, and null; complex types are array and record; in addition there are special types File, Directory and Any.

The following example demonstrates input parameters of different types that appear on the command line in different ways.

First, create a file called inp.cwl, containing the following:

inp.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
inputs:
  example_flag:
    type: boolean
    inputBinding:
      position: 1
      prefix: -f
  example_string:
    type: string
    inputBinding:
      position: 3
      prefix: --example-string
  example_int:
    type: int
    inputBinding:
      position: 2
      prefix: -i
      separate: false
  example_file:
    type: File?
    inputBinding:
      prefix: --file=
      separate: false
      position: 4

outputs: []

Create a file called inp-job.yml:

inp-job.yml
example_flag: true
example_string: hello
example_int: 42
example_file:
  class: File
  path: whale.txt

Note

You can use cwltool to create a template input object. That saves you from having to type all the input parameters in an input object file:

$ cwltool --make-template inp.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'inp.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/inp.cwl'
example_string: a_string  # type 'string'
example_int: 0  # type 'int'
example_flag: false  # type 'boolean'
example_file:  # type 'File' (optional)
    class: File
    path: a/file/path

You can redirect the output to a file, for example cwltool --make-template inp.cwl > inp-job.yml, and then replace the default values with your desired input values.

Notice that “example_file”, as a File type, must be provided as an object with the fields class: File and path.

Next, create an empty file called whale.txt by running touch on the command line:

$ touch whale.txt

Now invoke cwltool with the tool description and the input object on the command line, using the command cwltool inp.cwl inp-job.yml. The following block shows the command and its expected output:

$ cwltool inp.cwl inp-job.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'inp.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/inp.cwl'
INFO [job inp.cwl] /tmp/co7p_ml8$ echo \
    -f \
    -i42 \
    --example-string \
    hello \
    --file=/tmp/u_wygi4r/stg2445be6b-7db0-46aa-af4a-af85f3f8ba4d/whale.txt
-f -i42 --example-string hello --file=/tmp/u_wygi4r/stg2445be6b-7db0-46aa-af4a-af85f3f8ba4d/whale.txt
INFO [job inp.cwl] completed success
{}INFO Final process status is success

Tip

Where did those `/tmp` paths come from?

The CWL reference runner (cwltool) and other runners create temporary directories with symbolic (“soft”) links to your input files. This ensures that tools do not accidentally access files that were not explicitly specified.
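The staging idea can be pictured with a few lines of Python. This is only a rough sketch of the concept, not how cwltool is implemented: stage_input is a hypothetical helper, and the "stg" directory prefix is an assumption made for illustration.

```python
import os
import tempfile


def stage_input(path):
    """Link `path` into a fresh temporary directory and return the staged path.

    Hypothetical helper: it only illustrates why a tool sees a temporary
    path instead of the path written in the input object.
    """
    staging_dir = tempfile.mkdtemp(prefix="stg")
    staged = os.path.join(staging_dir, os.path.basename(path))
    os.symlink(os.path.abspath(path), staged)
    return staged


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "whale.txt")
    open(source, "w").close()        # same effect as `touch whale.txt`
    staged = stage_input(source)
    print(staged)                    # a path under the system temp directory
    print(os.path.islink(staged))    # True
```

Because the tool only receives the staged path, any file that was not listed as an input is simply absent from the directory it works in.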

The field inputBinding is optional and indicates whether and how the input parameter should appear on the tool’s command line. If inputBinding is missing, the parameter does not appear on the command line. Let’s look at each example in detail.

example_flag:
  type: boolean
  inputBinding:
    position: 1
    prefix: -f

Boolean types are treated as flags. If the input parameter “example_flag” is “true”, then the prefix is added to the command line. If it is false, no flag is added.

example_string:
  type: string
  inputBinding:
    position: 3
    prefix: --example-string

String types appear on the command line as literal values. The prefix is optional. If provided, it appears as a separate argument on the command line before the parameter value. In the example above, this is rendered as --example-string hello.

example_int:
  type: int
  inputBinding:
    position: 2
    prefix: -i
    separate: false

Integer (and floating point) types appear on the command line in their decimal text representation. When the option separate is false (the default value is true), the prefix and value are combined into a single argument. In the example above, this is rendered as -i42.

example_file:
  type: File?
  inputBinding:
    prefix: --file=
    separate: false
    position: 4

File types appear on the command line as the path to the file. A question mark ? at the end of the parameter type indicates that the parameter is optional. In the example above, this is rendered as --file=/tmp/random/path/whale.txt. However, if the “example_file” parameter were not provided in the input, nothing would appear on the command line.
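The four binding rules described above can be condensed into a small Python model. It is a minimal sketch for illustration only: render_binding is a hypothetical helper, not part of cwltool, and it covers just the cases shown in inp.cwl.

```python
def render_binding(value, prefix=None, separate=True):
    """Return the command-line arguments produced by one input value."""
    if value is None:
        # Optional parameter that was not provided: nothing is emitted.
        return []
    if isinstance(value, bool):
        # Booleans act as flags: the prefix appears only when the value is true.
        return [prefix] if value and prefix else []
    text = str(value)  # strings, numbers, and file paths appear as text
    if prefix is None:
        return [text]
    if separate:
        return [prefix, text]   # two arguments, e.g. --example-string hello
    return [prefix + text]      # one argument, e.g. -i42


print(render_binding(True, "-f"))                      # ['-f']
print(render_binding("hello", "--example-string"))     # ['--example-string', 'hello']
print(render_binding(42, "-i", separate=False))        # ['-i42']
print(render_binding(None, "--file=", separate=False)) # []
```

The boolean check comes before the generic conversion because a Python bool is also an int, and a flag must never be rendered as the text True or False.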

Input files are read-only. If you wish to update an input file, you must first copy it to the output directory.

The value of position determines where a parameter appears on the command line. Positions are relative to one another, not absolute, so they do not have to be sequential: three parameters with positions 1, 3, 5 produce the same command line as 1, 2, 3. More than one parameter can have the same position (ties are broken using the parameter name), and the position field itself is optional. The default position is 0.

The baseCommand field will always appear in the final command line before the parameters.
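The ordering rules can be sketched as a sort on the pair (position, parameter name), with the base command placed first. The build_command function and the binding dictionaries below are hypothetical and exist only to illustrate the rules stated above; a real runner works from the full tool description.

```python
def build_command(base_command, bindings):
    """Order bindings by (position, name) and place them after base_command."""
    ordered = sorted(bindings, key=lambda b: (b.get("position", 0), b["name"]))
    command = [base_command]
    for binding in ordered:
        command.extend(binding["args"])
    return command


# The parameters of inp.cwl, listed in the order they are declared.
bindings = [
    {"name": "example_flag", "position": 1, "args": ["-f"]},
    {"name": "example_string", "position": 3, "args": ["--example-string", "hello"]},
    {"name": "example_int", "position": 2, "args": ["-i42"]},
    {"name": "example_file", "position": 4, "args": ["--file=whale.txt"]},
]
print(" ".join(build_command("echo", bindings)))
# echo -f -i42 --example-string hello --file=whale.txt
```

Only the relative order of the position values matters, so renumbering them as 10, 30, 20, 40 would give the same result.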

2.4.2. Array Inputs

It is easy to add arrays of input parameters to the command line. There are two ways to specify an array parameter. The first is to provide a type field with type: array and items defining the valid data types that may appear in the array. Alternatively, brackets [] may be added after the type name to indicate that the input parameter is an array of that type.
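The two notations are equivalent: the bracket form is shorthand for the longer one. The following Python sketch shows the idea, together with the ? suffix used for optional parameters earlier in this section. The expand_type function is a hypothetical illustration, not the parser that CWL runners use, and the position of "null" in the resulting list is an assumption.

```python
def expand_type(shorthand):
    """Expand 'T[]' and 'T?' shorthand into the explicit type structure."""
    optional = shorthand.endswith("?")
    if optional:
        shorthand = shorthand[:-1]
    if shorthand.endswith("[]"):
        # 'string[]' is short for {type: array, items: string}.
        expanded = {"type": "array", "items": expand_type(shorthand[:-2])}
    else:
        expanded = shorthand
    # 'File?' is short for a choice between null and File.
    return ["null", expanded] if optional else expanded


print(expand_type("string[]"))  # {'type': 'array', 'items': 'string'}
print(expand_type("File?"))     # ['null', 'File']
```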

array-inputs.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
inputs:
  filesA:
    type: string[]
    inputBinding:
      prefix: -A
      position: 1

  filesB:
    type:
      type: array
      items: string
      inputBinding:
        prefix: -B=
        separate: false
    inputBinding:
      position: 2

  filesC:
    type: string[]
    inputBinding:
      prefix: -C=
      itemSeparator: ","
      separate: false
      position: 4

outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo

array-inputs-job.yml
filesA: [one, two, three]
filesB: [four, five, six]
filesC: [seven, eight, nine]

Now invoke cwltool providing the tool description and the input object on the command line:

$ cwltool array-inputs.cwl array-inputs-job.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'array-inputs.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/array-inputs.cwl'
INFO [job array-inputs.cwl] /tmp/o5496i5q$ echo \
    -A \
    one \
    two \
    three \
    -B=four \
    -B=five \
    -B=six \
    -C=seven,eight,nine > /tmp/o5496i5q/output.txt
INFO [job array-inputs.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$91038e29452bc77dcd21edef90a15075f3071540",
        "size": 60,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one two three -B=four -B=five -B=six -C=seven,eight,nine

The inputBinding can appear either on the outer array parameter definition or the inner array element definition, and these produce different behavior when constructing the command line, as shown above. In addition, the itemSeparator field, if provided, specifies that array values should be concatenated into a single argument separated by the item separator string.
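The three behaviors seen in the output above can be modeled in Python. This is a minimal sketch under the assumption that only the cases from array-inputs.cwl need to be covered; outer_binding and inner_binding are hypothetical names, not cwltool functions.

```python
def outer_binding(values, prefix, item_separator=None, separate=True):
    """inputBinding on the array itself: the prefix appears once."""
    if item_separator is None:
        # filesA: the prefix, then each item as its own argument.
        # (`separate` is not modeled for this case.)
        return [prefix, *values]
    # filesC: items are joined into a single value first.
    joined = item_separator.join(values)
    return [prefix, joined] if separate else [prefix + joined]


def inner_binding(values, prefix, separate=True):
    """inputBinding on the array items: the prefix repeats for every item."""
    args = []
    for value in values:
        args.extend([prefix, value] if separate else [prefix + value])
    return args


args = (
    outer_binding(["one", "two", "three"], "-A")
    + inner_binding(["four", "five", "six"], "-B=", separate=False)
    + outer_binding(["seven", "eight", "nine"], "-C=", item_separator=",", separate=False)
)
print(" ".join(args))
# -A one two three -B=four -B=five -B=six -C=seven,eight,nine
```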

Note that the arrays of inputs are specified inside square brackets [] in array-inputs-job.yml. Arrays can also be expressed over multiple lines, with each array value on its own line marked by a leading -. This will be demonstrated in the next lesson and is discussed in more detail in the YAML Guide. You can specify arrays of arrays, arrays of records, and other complex types.

2.4.3. Inclusive and Exclusive Inputs

Sometimes an underlying tool has several arguments that must be provided together (they are dependent) or several arguments that cannot be provided together (they are exclusive). You can use records and type unions to group parameters together to describe these two conditions.

record.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
inputs:
  dependent_parameters:
    type:
      type: record
      name: dependent_parameters
      fields:
        itemA:
          type: string
          inputBinding:
            prefix: -A
        itemB:
          type: string
          inputBinding:
            prefix: -B
  exclusive_parameters:
    type:
      - type: record
        name: itemC
        fields:
          itemC:
            type: string
            inputBinding:
              prefix: -C
      - type: record
        name: itemD
        fields:
          itemD:
            type: string
            inputBinding:
              prefix: -D
outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo

record-job1.yml
dependent_parameters:
  itemA: one
exclusive_parameters:
  itemC: three

$ cwltool record.cwl record-job1.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
ERROR Workflow error, try again with --debug for more information:
Invalid job input record:
record-job1.yml:1:1: the 'dependent_parameters' field is not valid because
                       missing required field 'itemB'

In the first example, you can’t provide itemA without also providing itemB.

record-job2.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemC: three
  itemD: four

$ cwltool record.cwl record-job2.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
record-job2.yml:6:3: Warning: invalid field 'itemD', expected one of: 'itemC'
WARNING record-job2.yml:6:3: Warning: invalid field 'itemD', expected one of: 'itemC'
INFO [job record.cwl] /tmp/ep2j6224$ echo \
    -A \
    one \
    -B \
    two \
    -C \
    three > /tmp/ep2j6224/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$329fe3b598fed0dfd40f511522eaf386edb2d077",
        "size": 23,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one -B two -C three

In the second example, itemC and itemD are exclusive, so only the first matching item (itemC) is added to the command line and the remaining item (itemD) is ignored.

record-job3.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemD: four

$ cwltool record.cwl record-job3.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
INFO [job record.cwl] /tmp/59j0nt0e$ echo \
    -A \
    one \
    -B \
    two \
    -D \
    four > /tmp/59j0nt0e/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$77f572b28e441240a5e30eb14f1d300bcc13a3b4",
        "size": 22,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one -B two -D four

In the third example, only itemD is provided, so it appears on the command line.
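The behavior across the three job files can be summarized with a small Python model. It is a simplified sketch based only on the outputs shown above: check_dependent and pick_exclusive are hypothetical helpers, and real validation in a CWL runner also checks the type of every field.

```python
def check_dependent(record, required_fields):
    """A record is valid only if every one of its fields is present."""
    missing = [name for name in required_fields if name not in record]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    return record


def pick_exclusive(record, alternatives):
    """Return the first alternative whose fields are all present.

    `alternatives` lists the field names of each record type in the order
    they are declared. Fields outside the chosen alternative are ignored.
    """
    for fields in alternatives:
        if all(name in record for name in fields):
            chosen = {name: record[name] for name in fields}
            ignored = sorted(set(record) - set(fields))
            return chosen, ignored
    raise ValueError("input matches none of the alternatives")


alternatives = [["itemC"], ["itemD"]]
print(pick_exclusive({"itemC": "three", "itemD": "four"}, alternatives))
# ({'itemC': 'three'}, ['itemD'])
print(pick_exclusive({"itemD": "four"}, alternatives))
# ({'itemD': 'four'}, [])
```

Calling check_dependent({"itemA": "one"}, ["itemA", "itemB"]) raises an error, which mirrors the failure seen with record-job1.yml.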

2.4.3.1. Exclusive Input Parameters with Expressions

If you use exclusive input parameters and reference them in expressions, be aware that the inputs JavaScript object will contain only one of the possible, mutually exclusive input values. Because the types of these exclusive values may differ, you may need to check which type is in use when you reference the properties of the input object.

Let’s use an example that contains an exclusive file_format input parameter that accepts null (i.e. no value provided), or any value from an enum.

exclusive-parameter-expressions.cwl
cwlVersion: v1.2
class: CommandLineTool

inputs:
  file_format:
    type:
      - 'null'
      - name: format_choices
        type: enum
        symbols:
          - auto
          - fasta
          - fastq
          - fasta.gz
          - fastq.gz
        inputBinding:
          position: 0
          prefix: '--format'
outputs:
  text_output:
    type: string
    outputBinding:
      outputEval: $(inputs.file_format)

baseCommand: 'true'

Note how the JavaScript expression uses the value of the exclusive input parameter without taking into consideration a null value. If you provide a valid value, such as fasta (one of the possible values of the enum), your command should execute successfully:

$ cwltool exclusive-parameter-expressions.cwl --file_format fasta
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'exclusive-parameter-expressions.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/exclusive-parameter-expressions.cwl'
INFO [job exclusive-parameter-expressions.cwl] /tmp/q2jkyter$ true \
    --format \
    fasta
INFO [job exclusive-parameter-expressions.cwl] completed success
{
    "text_output": "fasta"
}INFO Final process status is success

However, if you do not provide any input value, then file_format evaluates to null, which does not match the expected type of the output field (a string), so running the tool fails.

$ cwltool exclusive-parameter-expressions.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'exclusive-parameter-expressions.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/exclusive-parameter-expressions.cwl'
INFO [job exclusive-parameter-expressions.cwl] /tmp/qkd8lw16$ true
ERROR [job exclusive-parameter-expressions.cwl] Job error:
Error validating output record. the 'text_output' field is not valid because
  the value is not string
 in {
    "text_output": null
}
WARNING [job exclusive-parameter-expressions.cwl] completed permanentFail
{}WARNING Final process status is permanentFail

To correct this, you should explicitly handle the possibility of a null value. For example, the expression could be changed to $(inputs.file_format || 'auto'), which falls back to the default value "auto" if none was provided on the command line or in the job input file.

Here, the boolean “or” operator || in JavaScript is used for its short-circuiting property. If inputs.file_format is truthy in a boolean context (for example, a valid non-empty string from the enum), evaluation stops at the first operand of || and the expression takes that value; it “short-circuits”. If, however, inputs.file_format is null, the whole expression’s value becomes that of the second operand, which is why a reasonable default can be provided there.
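The same fallback pattern can be tried outside of CWL. The sketch below is a Python analogy, not the JavaScript that a CWL runner evaluates: Python's or operator short-circuits in a comparable way for the two cases discussed here, with None standing in for null. The output_eval function is a hypothetical stand-in for the outputEval expression.

```python
def output_eval(file_format):
    """Mimic $(inputs.file_format || 'auto') for a string or missing value."""
    # `or` returns its first operand when that operand is truthy,
    # otherwise its second operand.
    return file_format or "auto"


print(output_eval("fasta"))  # fasta
print(output_eval(None))     # auto
```

In both cases the result is a string, which is what the text_output field requires.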