2.4. Inputs
2.4.1. Essential Input Parameters
The `inputs` of a tool are a list of input parameters that control how to run the tool. Each parameter has an `id` for the name of the parameter, and a `type` describing what types of values are valid for that parameter.
Available primitive types are string, boolean, int, long, float, double, and null; complex types are array and record; in addition there are special types File, Directory and Any.
The following example demonstrates input parameters of different types that appear on the command line in different ways.
First, create a file called `inp.cwl` containing the following:
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
inputs:
  example_flag:
    type: boolean
    inputBinding:
      position: 1
      prefix: -f
  example_string:
    type: string
    inputBinding:
      position: 3
      prefix: --example-string
  example_int:
    type: int
    inputBinding:
      position: 2
      prefix: -i
      separate: false
  example_file:
    type: File?
    inputBinding:
      prefix: --file=
      separate: false
      position: 4
outputs: []
Create a file called `inp-job.yml`:
example_flag: true
example_string: hello
example_int: 42
example_file:
  class: File
  path: whale.txt
Note
You can use `cwltool` to create a template input object. That saves you from having to type all the input parameters in an input object file:
$ cwltool --make-template inp.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'inp.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/inp.cwl'
example_string: a_string # type 'string'
example_int: 0 # type 'int'
example_flag: false # type 'boolean'
example_file: # type 'File' (optional)
    class: File
    path: a/file/path
You can redirect the output to a file, for example `cwltool --make-template inp.cwl > inp-job.yml`, and then replace the default values with your desired input values.
Notice that `example_file`, as a `File` type, must be provided as an object with the fields `class: File` and `path`.
Next, create an empty file called `whale.txt` by running `touch whale.txt` on the command line.
$ touch whale.txt
Now invoke `cwltool` with the tool description and the input object on the command line, using the command `cwltool inp.cwl inp-job.yml`. The following shows this command and its expected output:
$ cwltool inp.cwl inp-job.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'inp.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/inp.cwl'
INFO [job inp.cwl] /tmp/co7p_ml8$ echo \
    -f \
    -i42 \
    --example-string \
    hello \
    --file=/tmp/u_wygi4r/stg2445be6b-7db0-46aa-af4a-af85f3f8ba4d/whale.txt
-f -i42 --example-string hello --file=/tmp/u_wygi4r/stg2445be6b-7db0-46aa-af4a-af85f3f8ba4d/whale.txt
INFO [job inp.cwl] completed success
{}INFO Final process status is success
Tip
Where did those `/tmp` paths come from?
The CWL reference runner (`cwltool`) and other runners create temporary directories with symbolic (“soft”) links to your input files to ensure that the tools aren’t accidentally accessing files that were not explicitly specified.
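The staging idea in the tip above can be illustrated with a minimal Python sketch. This is not cwltool's code: the function name `stage_input` and the `stg` directory prefix are hypothetical, chosen only to mirror the paths seen in the log output. It assumes a platform where creating symbolic links is permitted.

```python
import os
import tempfile


def stage_input(source_path, staging_root=None):
    """Symlink source_path into a fresh temporary directory.

    Returns the staged path, which is what a tool would be given
    instead of the original location. (Hypothetical helper.)
    """
    staging_dir = tempfile.mkdtemp(prefix="stg", dir=staging_root)
    staged_path = os.path.join(staging_dir, os.path.basename(source_path))
    os.symlink(os.path.abspath(source_path), staged_path)
    return staged_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "whale.txt")
        open(original, "w").close()  # same effect as `touch whale.txt`
        staged = stage_input(original, staging_root=workdir)
        print(staged)                   # a path ending in .../stgXXXX/whale.txt
        print(os.path.islink(staged))   # True
```

Because the tool only sees the staging directory, sibling files next to the original `whale.txt` are not visible to it unless they are also declared as inputs.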
The field `inputBinding` is optional and indicates whether and how the input parameter should appear on the tool’s command line. If `inputBinding` is missing, the parameter does not appear on the command line. Let’s look at each example in detail.
example_flag:
  type: boolean
  inputBinding:
    position: 1
    prefix: -f
Boolean types are treated as a flag. If the input parameter `example_flag` is `true`, then `prefix` is added to the command line. If it is `false`, no flag is added.
example_string:
  type: string
  inputBinding:
    position: 3
    prefix: --example-string
String types appear on the command line as literal values. The `prefix` is optional; if provided, it appears as a separate argument on the command line before the parameter value. In the example above, this is rendered as `--example-string hello`.
example_int:
  type: int
  inputBinding:
    position: 2
    prefix: -i
    separate: false
Integer (and floating point) types appear on the command line with decimal text representation. When the option `separate` is false (the default value is true), the prefix and value are combined into a single argument. In the example above, this is rendered as `-i42`.
example_file:
  type: File?
  inputBinding:
    prefix: --file=
    separate: false
    position: 4
File types appear on the command line as the path to the file. When the parameter type ends with a question mark `?`, it indicates that the parameter is optional. In the example above, this is rendered as `--file=/tmp/random/path/whale.txt`. However, if the `example_file` parameter were not provided in the input, nothing would appear on the command line.
Input files are read-only. If you wish to update an input file, you must first copy it to the output directory.
The value of `position` is used to determine where the parameter should appear on the command line. Positions are relative to one another, not absolute. As a result, positions do not have to be sequential: three parameters with positions 1, 3, 5 will result in the same command line as 1, 2, 3. More than one parameter can have the same position (ties are broken using the parameter name), and the `position` field itself is optional. The default position is 0.
The `baseCommand` field will always appear in the final command line before the parameters.
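The binding rules described above (flags, `prefix`, `separate`, optional inputs, and ordering by `position` with ties broken by name) can be modelled in a few lines of Python. This is a simplified sketch for intuition only, not how any CWL runner is implemented: the function `build_command` is hypothetical, and a `File` input is represented by a plain path string rather than a staged file.

```python
def build_command(base_command, bindings, job):
    """Return the argument list for a tool (simplified, hypothetical model).

    bindings maps each parameter name to its inputBinding fields
    (position, prefix, separate); job maps parameter names to values.
    """
    keyed = []
    for name, binding in bindings.items():
        value = job.get(name)
        if value is None or value is False:
            continue  # missing optional inputs and false flags add nothing
        prefix = binding.get("prefix")
        if value is True:
            args = [prefix] if prefix else []  # a true flag adds only its prefix
        else:
            text = str(value)
            if prefix is None:
                args = [text]
            elif binding.get("separate", True):
                args = [prefix, text]    # two arguments: --example-string hello
            else:
                args = [prefix + text]   # one argument: -i42
        # Sort key: position first (default 0), then parameter name for ties.
        keyed.append(((binding.get("position", 0), name), args))
    keyed.sort(key=lambda item: item[0])
    command = list(base_command)  # baseCommand always comes first
    for _, args in keyed:
        command.extend(args)
    return command


bindings = {
    "example_flag": {"position": 1, "prefix": "-f"},
    "example_string": {"position": 3, "prefix": "--example-string"},
    "example_int": {"position": 2, "prefix": "-i", "separate": False},
    "example_file": {"position": 4, "prefix": "--file=", "separate": False},
}
job = {
    "example_flag": True,
    "example_string": "hello",
    "example_int": 42,
    "example_file": "whale.txt",  # assumption: path used as-is, no staging
}
print(build_command(["echo"], bindings, job))
# ['echo', '-f', '-i42', '--example-string', 'hello', '--file=whale.txt']
```

Only the relative order of the positions matters in this model: changing the positions to 10, 30, 20, 40 yields the same argument list.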
2.4.2. Array Inputs
It is easy to add arrays of input parameters to the command line. There are two ways to specify an array parameter. The first is to provide a `type` field with `type: array` and `items` defining the valid data types that may appear in the array. Alternatively, brackets `[]` may be added after the type name to indicate that the input parameter is an array of that type.
array-inputs.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
inputs:
  filesA:
    type: string[]
    inputBinding:
      prefix: -A
      position: 1
  filesB:
    type:
      type: array
      items: string
      inputBinding:
        prefix: -B=
        separate: false
    inputBinding:
      position: 2
  filesC:
    type: string[]
    inputBinding:
      prefix: -C=
      itemSeparator: ","
      separate: false
      position: 4
outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo
array-inputs-job.yml
filesA: [one, two, three]
filesB: [four, five, six]
filesC: [seven, eight, nine]
Now invoke `cwltool` providing the tool description and the input object on the command line:
$ cwltool array-inputs.cwl array-inputs-job.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'array-inputs.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/array-inputs.cwl'
INFO [job array-inputs.cwl] /tmp/o5496i5q$ echo \
    -A \
    one \
    two \
    three \
    -B=four \
    -B=five \
    -B=six \
    -C=seven,eight,nine > /tmp/o5496i5q/output.txt
INFO [job array-inputs.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$91038e29452bc77dcd21edef90a15075f3071540",
        "size": 60,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one two three -B=four -B=five -B=six -C=seven,eight,nine
The `inputBinding` can appear either on the outer array parameter definition or the inner array element definition, and these produce different behavior when constructing the command line, as shown above.
In addition, the `itemSeparator` field, if provided, specifies that array values should be concatenated into a single argument separated by the item separator string.
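To make the three behaviors concrete, here is a small Python model of the array cases used in this example. It is a hedged sketch, not a runner implementation: `bind_array` and its parameter names are hypothetical, and it covers only the combinations shown above (an outer prefix, an inner per-item binding, and an outer binding with `itemSeparator`).

```python
def bind_array(values, prefix=None, separate=True,
               item_separator=None, item_binding=None):
    """Render one array input as command-line arguments (simplified model).

    prefix / separate / item_separator come from the outer inputBinding;
    item_binding holds the inner (per-element) inputBinding, if any.
    """
    rendered = [str(v) for v in values]
    if not rendered:
        return []  # assumption in this sketch: an empty array adds nothing
    if item_separator is not None:
        # All items collapse into a single argument, e.g. -C=seven,eight,nine
        joined = item_separator.join(rendered)
        if prefix is None:
            return [joined]
        return [prefix, joined] if separate else [prefix + joined]
    # Without itemSeparator the outer prefix appears once, before the items.
    args = [prefix] if prefix else []
    for text in rendered:
        inner_prefix = (item_binding or {}).get("prefix")
        if inner_prefix is None:
            args.append(text)
        elif item_binding.get("separate", True):
            args.extend([inner_prefix, text])
        else:
            args.append(inner_prefix + text)  # e.g. -B=four
    return args


files_a = bind_array(["one", "two", "three"], prefix="-A")
files_b = bind_array(["four", "five", "six"],
                     item_binding={"prefix": "-B=", "separate": False})
files_c = bind_array(["seven", "eight", "nine"],
                     prefix="-C=", separate=False, item_separator=",")
print(" ".join(files_a + files_b + files_c))
# -A one two three -B=four -B=five -B=six -C=seven,eight,nine
```

The printed line matches the contents of `output.txt` above: the outer binding on `filesA` emits the prefix once, the inner binding on `filesB` repeats it for every element, and `itemSeparator` on `filesC` produces a single joined argument.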
Note that the arrays of inputs are specified inside square brackets `[]` in `array-inputs-job.yml`. Arrays can also be expressed over multiple lines, where array values that are not defined with an associated key are marked by a leading `-`. This will be demonstrated in the next lesson and is discussed in more detail in the YAML Guide.
You can specify arrays of arrays, arrays of records, and other complex types.
2.4.3. Inclusive and Exclusive Inputs
Sometimes an underlying tool has several arguments that must be provided together (they are dependent) or several arguments that cannot be provided together (they are exclusive). You can use records and type unions to group parameters together to describe these two conditions.
record.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
inputs:
  dependent_parameters:
    type:
      type: record
      name: dependent_parameters
      fields:
        itemA:
          type: string
          inputBinding:
            prefix: -A
        itemB:
          type: string
          inputBinding:
            prefix: -B
  exclusive_parameters:
    type:
      - type: record
        name: itemC
        fields:
          itemC:
            type: string
            inputBinding:
              prefix: -C
      - type: record
        name: itemD
        fields:
          itemD:
            type: string
            inputBinding:
              prefix: -D
outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo
record-job1.yml
dependent_parameters:
  itemA: one
exclusive_parameters:
  itemC: three
$ cwltool record.cwl record-job1.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
ERROR Workflow error, try again with --debug for more information:
Invalid job input record:
record-job1.yml:1:1: the 'dependent_parameters' field is not valid because
missing required field 'itemB'
In the first example, you can’t provide `itemA` without also providing `itemB`.
record-job2.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemC: three
  itemD: four
$ cwltool record.cwl record-job2.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
record-job2.yml:6:3: Warning: invalid field 'itemD', expected one of: 'itemC'
WARNING record-job2.yml:6:3: Warning: invalid field 'itemD', expected one of: 'itemC'
INFO [job record.cwl] /tmp/ep2j6224$ echo \
    -A \
    one \
    -B \
    two \
    -C \
    three > /tmp/ep2j6224/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$329fe3b598fed0dfd40f511522eaf386edb2d077",
        "size": 23,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one -B two -C three
In the second example, `itemC` and `itemD` are exclusive, so only the first matching item (`itemC`) is added to the command line and the remaining item (`itemD`) is ignored.
record-job3.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemD: four
$ cwltool record.cwl record-job3.yml
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
INFO [job record.cwl] /tmp/59j0nt0e$ echo \
    -A \
    one \
    -B \
    two \
    -D \
    four > /tmp/59j0nt0e/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$77f572b28e441240a5e30eb14f1d300bcc13a3b4",
        "size": 22,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one -B two -D four
In the third example, only `itemD` is provided, so it appears on the command line.
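The three outcomes above follow from two rules: a record is valid only if all of its required fields are present, and a union of record types is resolved by taking the first alternative that the value satisfies. The Python sketch below models just those two rules. All names in it (`matches_record`, `select_record`, `bind_record`, `build_args`) are hypothetical, each record is reduced to a mapping from field name to prefix, and real runners perform fuller type validation than this.

```python
def matches_record(fields, value):
    """True if value is a mapping that supplies every field of the record."""
    return isinstance(value, dict) and all(name in value for name in fields)


def select_record(alternatives, value):
    """Return the first record type in the union that value satisfies."""
    for fields in alternatives:
        if matches_record(fields, value):
            return fields
    raise ValueError("value does not match any record type in the union")


def bind_record(fields, value):
    """Emit prefix/value pairs for the fields of the chosen record only."""
    args = []
    for name, prefix in fields.items():
        args.extend([prefix, str(value[name])])
    return args


# Field name -> prefix, mirroring record.cwl in simplified form.
DEPENDENT = {"itemA": "-A", "itemB": "-B"}
EXCLUSIVE = [{"itemC": "-C"}, {"itemD": "-D"}]


def build_args(job):
    dependent = job["dependent_parameters"]
    if not matches_record(DEPENDENT, dependent):
        missing = [name for name in DEPENDENT if name not in dependent]
        raise ValueError("missing required field(s): " + ", ".join(missing))
    exclusive = job["exclusive_parameters"]
    chosen = select_record(EXCLUSIVE, exclusive)
    # Fields outside the chosen record (for example itemD in job 2) are ignored.
    return bind_record(DEPENDENT, dependent) + bind_record(chosen, exclusive)


job2 = {
    "dependent_parameters": {"itemA": "one", "itemB": "two"},
    "exclusive_parameters": {"itemC": "three", "itemD": "four"},
}
print(build_args(job2))
# ['-A', 'one', '-B', 'two', '-C', 'three']
```

Passing the first job (with `itemB` missing) to `build_args` raises a `ValueError`, and passing the third job (only `itemD`) selects the second alternative and yields `-D four`.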
2.4.3.1. Exclusive Input Parameters with Expressions
If you use exclusive input parameters and reference them in expressions, be aware that the `inputs` JavaScript object will contain only one of the possible, mutually exclusive input values. Because the types of these exclusive values may differ, you may need to check which type is in use when you reference the properties of the `inputs` object.
Let’s use an example that contains an exclusive `file_format` input parameter that accepts `null` (i.e. no value provided), or any value from an enum.
exclusive-parameter-expressions.cwl
cwlVersion: v1.2
class: CommandLineTool
inputs:
  file_format:
    type:
      - 'null'
      - name: format_choices
        type: enum
        symbols:
          - auto
          - fasta
          - fastq
          - fasta.gz
          - fastq.gz
    inputBinding:
      position: 0
      prefix: '--format'
outputs:
  text_output:
    type: string
    outputBinding:
      outputEval: $(inputs.file_format)
baseCommand: 'true'
Note how the JavaScript expression uses the value of the exclusive input parameter without taking into consideration a `null` value. If you provide a valid value, such as `fasta` (one of the possible values of the enum), your command should execute successfully:
$ cwltool exclusive-parameter-expressions.cwl --file_format fasta
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'exclusive-parameter-expressions.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/exclusive-parameter-expressions.cwl'
INFO [job exclusive-parameter-expressions.cwl] /tmp/q2jkyter$ true \
    --format \
    fasta
INFO [job exclusive-parameter-expressions.cwl] completed success
{
    "text_output": "fasta"
}INFO Final process status is success
However, if you do not provide any input value, then `file_format` evaluates to `null`, which does not match the expected type for the output field (a `string`), resulting in a failure when running the tool.
$ cwltool exclusive-parameter-expressions.cwl
INFO /opt/hostedtoolcache/Python/3.9.19/x64/bin/cwltool 3.1.20240508115724
INFO Resolved 'exclusive-parameter-expressions.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/exclusive-parameter-expressions.cwl'
INFO [job exclusive-parameter-expressions.cwl] /tmp/qkd8lw16$ true
ERROR [job exclusive-parameter-expressions.cwl] Job error:
Error validating output record. the 'text_output' field is not valid because
the value is not string
in {
"text_output": null
}
WARNING [job exclusive-parameter-expressions.cwl] completed permanentFail
{}WARNING Final process status is permanentFail
To correct it, you should explicitly handle the possibility of a `null` value. For example, the expression could be changed to `$(inputs.file_format || 'auto')` to have a default value of `"auto"` if none was provided on the command line or in the job input file.
Here, the boolean “or” operator `||` in JavaScript is used for its short-circuiting property. If `inputs.file_format` is “true” in a boolean context (e.g. a valid non-empty string from the enum), the evaluation of the expression stops at the first operand of `||`; it “short-circuits”. If, however, `inputs.file_format` is `null`, the whole expression’s value becomes that of the second operand, which is why a reasonable default can be provided there.
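The same failure and fix can be reproduced outside of CWL with a short Python analogue. This is only a model: Python's `or` stands in for JavaScript's `||` (both short-circuit, and for the values used here `None` plays the role of `null`), and the functions `output_eval` and `validate_string_output` are hypothetical stand-ins for the expression and for the runner's output type check.

```python
def output_eval(file_format, default=None):
    """Model of $(inputs.file_format || default).

    If file_format is truthy (a non-empty string), it is returned and the
    default is never consulted; if it is None, the default is returned.
    """
    return file_format or default


def validate_string_output(value):
    """Stand-in for checking an output declared as type: string."""
    if not isinstance(value, str):
        raise TypeError("the value is not string")
    return value


# A value from the enum passes through unchanged.
print(validate_string_output(output_eval("fasta")))         # fasta

# No input and no fallback: the expression yields None and validation fails.
try:
    validate_string_output(output_eval(None))
except TypeError as err:
    print("validation failed:", err)

# With a fallback, the missing input is replaced by the default.
print(validate_string_output(output_eval(None, "auto")))    # auto
```

One caveat of this pattern, in both languages, is that any falsy value triggers the fallback, so an empty string would also be replaced by the default.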