Introduction to tool wrapping
About tool wrapping
Wrapping a tool means describing a command line tool in CWL (Common Workflow Language) so that it can be run as an app or used in a workflow with other tools and apps. Tools are wrapped using the Rabix Composer tool editor.
There are two parts to wrapping a tool:
- creating a Docker image containing the tool
- creating a CWL definition describing the tool.
The Docker image contains the command line tool and any supporting files it needs in order to run (for example, fixed configuration files).
The CWL definition specifies the location of the Docker image and defines how the inputs and outputs exposed to the user of the tool map to the underlying inputs, outputs and other options required by the command line tool.
A CWL tool definition may expose all the parameters and options available for a command line tool, but often only some parameters and options are exposed, and other more technical parameters and options are set to fixed values.
When the tool is run in a workflow, it runs inside a Docker container created from the Docker image containing the tool and its dependencies. The execution process passes the input data to the Docker container and retrieves the results from it. The command line that executes the tool inside the container is constructed from the inputs and arguments specified when the tool was wrapped, together with the values supplied for those inputs.
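As a rough illustration of how the command line is assembled, the sketch below wraps the standard `wc` utility. It assumes CWL v1.0 and the public `ubuntu:20.04` image on Docker Hub; the names `text_file`, `line_count` and `line_count.txt` are hypothetical and chosen only for this example.

```yaml
cwlVersion: v1.0
class: CommandLineTool

# Docker image the tool runs in (assumption: any image that ships `wc`)
requirements:
  DockerRequirement:
    dockerPull: ubuntu:20.04

# Fixed part of the command line, set when the tool is wrapped
baseCommand: [wc]
arguments: ["-l"]

# Variable part, supplied each time the tool is run
inputs:
  text_file:              # hypothetical input name
    type: File
    inputBinding:
      position: 1         # placed after the fixed "-l" argument

# Capture whatever `wc` prints and return it as the result
stdout: line_count.txt
outputs:
  line_count:
    type: stdout
```

If `text_file` is set to a file named `sample.txt`, the command executed inside the container is effectively `wc -l <path to sample.txt inside the container>`, with its standard output redirected to `line_count.txt`.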
Tool wrapping components
There are several key components that need to be specified when wrapping a tool. Here’s a brief description of each. This tool editor tutorial looks at each component in more detail and shows how they could be defined when wrapping a real command line tool using Rabix Composer.
- The Docker image contains the command line tool and supporting files if required. The Docker image needs to be saved in an image repository, usually either the Seven Bridges Image Registry or Docker’s own image repository on Docker Hub.
- The base command is the fixed part of the command that invokes the tool, before any options or parameters.
- The arguments, despite the name, are any parts of the command line after the base command that you want set to fixed values when the tool executes. Arguments can be specified by a prefix, a position on the command line, or both. If an argument can be positioned immediately after the base command, you can optionally include it in the base command section instead.
- The input ports define the variable data that you want to supply to the tool when it executes. Many input ports will be files, but input ports can also have other data types, like strings, integers or structures. When a tool is placed in a workflow, an input port can either be connected to an output port of another tool, or can be used to define an input to the workflow.
- The output ports define the data that the tool creates when it executes. Many outputs will be files, but output ports can also have other data types, like strings, integers or structures. When a tool is placed in a workflow, an output port can either be connected to an input port of another tool, or can be used to define an output to the workflow.
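In CWL terms, these components might map onto a tool definition roughly as follows. This is a minimal sketch that wraps the standard `sort` utility, assuming CWL v1.0 and the public `ubuntu:20.04` image; the port names `key_column`, `unsorted_file` and `sorted_file`, and the file name `sorted.txt`, are hypothetical.

```yaml
cwlVersion: v1.0
class: CommandLineTool

# Docker image: contains the command line tool
requirements:
  DockerRequirement:
    dockerPull: ubuntu:20.04

# Base command: the fixed part that invokes the tool
baseCommand: [sort]

# Arguments: fixed values not exposed to the user of the tool
arguments:
  - valueFrom: "-r"         # specified by position only
    position: 0
  - prefix: "-o"            # specified by prefix and position
    valueFrom: sorted.txt
    position: 1

# Input ports: values supplied when the tool executes
inputs:
  key_column:
    type: int
    inputBinding:
      prefix: "-k"
      position: 2
  unsorted_file:
    type: File
    inputBinding:
      position: 3

# Output ports: data the tool creates
outputs:
  sorted_file:
    type: File
    outputBinding:
      glob: sorted.txt      # collect the file written by "-o"
```

With `key_column` set to 2, the resulting command line would be `sort -r -o sorted.txt -k 2 <path to the input file>`. Only the sort key and the input file are exposed; the reverse ordering and the output file name are fixed by the wrapper.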
About dynamic expressions
Dynamic expressions are an important and useful concept in tool wrapping. They allow you to express the value of one parameter or option in terms of another parameter, option, or other aspect of the tool’s execution. For example, you may want to specify an output file name that is based on the name of an input file, perhaps with a suffix or a different file extension. Dynamic expressions can be used in arguments, input ports and output ports, as well as in various other places in the Tool Editor.
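The output file name case could be written along the following lines. This is a sketch only, assuming CWL v1.0, where expressions are written as `$(...)` and full JavaScript expressions need `InlineJavascriptRequirement`; other CWL versions use a different expression syntax. The port names and the `.sorted.txt` suffix are hypothetical.

```yaml
cwlVersion: v1.0
class: CommandLineTool

requirements:
  DockerRequirement:
    dockerPull: ubuntu:20.04          # assumed public image that ships `sort`
  InlineJavascriptRequirement: {}     # enables JavaScript inside $(...)

baseCommand: [sort]

arguments:
  # Output name derived from the input file name at run time
  - prefix: "-o"
    valueFrom: $(inputs.unsorted_file.nameroot + ".sorted.txt")

inputs:
  unsorted_file:
    type: File
    inputBinding:
      position: 1

outputs:
  sorted_file:
    type: File
    outputBinding:
      # The same expression tells the executor which file to collect
      glob: $(inputs.unsorted_file.nameroot + ".sorted.txt")
```

Here `nameroot` is the input file's name without its extension, so an input named `reads.txt` would produce an output named `reads.sorted.txt`.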