Tutorial: How to Add Scripts To Workflows in UGENE

Scripts in Workflow

We are about to find out how to use scripts to define the parameter values of workflow processes.

UGENE Workflow Designer allows you to create and run complex computational workflows. The scripting mechanism provides additional flexibility for defining the behavior of workers (or processes) and the values of their parameters.

Let's load a simple sample scheme that converts sequences into GenBank format. It consists of two workers and the dataflow between them. The sequence reader reads sequences from the input files specified with the "Input file" parameter. I will specify two files, one in PDB format and one in CLUSTAL format, and select the "one iteration" mode to process each file identically.

Now I will specify the output file names for the "Write Genbank" worker. But first let's set the "Scripting mode" toolbar item to "Show scripting options". This allows us to use scripts to define certain worker parameters.

Now the "Write Genbank" parameters table is extended with a third column, "Scripts". We can see that scripting is available for the "Output file" and "Existing file" parameters. I click on the pulldown and select the "user script" item. This brings up the Script editor dialog. The variables declared in the upper text area represent input and output data.

For the "Output file" parameter we have the input file location (URL), the annotations table (if any) and the sequence. The output file location is defined by the value of the URL_OUT variable.

Without a script, the output file locations are based on the entered path. With a script, you can derive the output file locations from the input file location. For instance, let's open this script.

It declares an auxiliary variable fileName containing the input file location, represented by a String object. The next line cuts off the part of the string that starts with a dot (this is the file extension, provided the file path and name contain no other dots). The third line appends "_copied" to the end of the remaining string. The last line adds the new extension to the string and assigns this new value to the output file location variable.

Let's see this script working! I press "Done", and then I press "Run schema" to run the "PDB converter". Our "PDB converter" has finished converting the sequences from the PDB format and ALN format files to GenBank files. As we can see, the whole path and name of the input ALN file were retained, and the extension was changed. The same is true for the PDB format file.

That's how you can use the scripting mechanism to define worker parameter values in UGENE Workflow Designer.
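The four script lines described above can be sketched roughly as follows. This is only an illustration under several assumptions, not the script shown in the video: the helper buildOutputUrl, the sample path and the ".gb" extension are hypothetical, and in the Script editor the statements would operate directly on the variables declared in the upper text area, with the result assigned to URL_OUT.

```javascript
// Hypothetical helper wrapping the four steps so they can be tried outside UGENE.
// In the Script editor the same statements would use the declared variables directly.
function buildOutputUrl(inputUrl) {
    // 1. Auxiliary variable holding the input file location as a string.
    var fileName = String(inputUrl);

    // 2. Cut off everything starting at the first dot. This removes only the
    //    extension if the path and name contain no other dots.
    var dot = fileName.indexOf(".");
    if (dot !== -1) {
        fileName = fileName.substring(0, dot);
    }

    // 3. Append the "_copied" suffix to the remaining string.
    fileName = fileName + "_copied";

    // 4. Add the new extension (".gb" is an assumed GenBank extension).
    return fileName + ".gb";
}

// Assumed sample input path; the result plays the role of the output file location.
var URL_OUT = buildOutputUrl("/data/samples/1CF7.pdb");
```

With the assumed input above, URL_OUT becomes "/data/samples/1CF7_copied.gb". Note that a path such as "/data/v1.2/seq.pdb" would be cut at the first dot and give "/data/v1_copied.gb", which is why the description mentions paths without other dots.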

Additional Materials

Documentation page

YouTube video