Build HMM from Alignment and Test It

This workflow builds a new profile HMM from input alignment, calibrates the HMM, and saves it to a file. It then runs a test HMM search over a sample sequence and saves the test results to a Genbank file.

How to Use This Sample

If you haven’t used the workflow samples in UGENE before, please refer to the “How to Use Sample Workflows” section of the documentation.

Workflow Sample Location

The workflow sample “Build HMM from Alignment and Test It” can be found in the “HMMER” section of the Workflow Designer samples.

Workflow Image

The workflow appears as follows:

Workflow Wizard

The wizard has 4 pages.

1. Input MSA(s)

On this page, you must input MSA(s).

2. Input Sequence(s)

On this page, you must input sequences.

3. HMM Build

On this page, you can modify HMM build parameters.

The following parameters are available:

ParameterDescription
Output HMM profileLocation of the output data file. If this attribute is set, the slot “Location” in port will not be used.
HMM strategySpecifies the kind of alignments you want to allow.
Profile nameDescriptive name of the HMM profile.
Calibrate profileEnables/disables optional profile calibration. Calibration increases the sensitivity of the database search.
Parallel calibrationNumber of parallel threads for calibration.
Fixed lengthFixes the length of generated random sequences. The default is to use various lengths from a Gaussian distribution.
Mean lengthMean length of synthetic sequences. Default: 325.
Number of samplesNumber of synthetic sequences. Default: 5000. Too few may fail EVD fitting; more improves accuracy.
Standard deviationStandard deviation of sequence lengths. Default: 200. Gaussian is left-truncated (no zero-lengths).
Random seedRandom seed. Use a positive integer for reproducible results. By default, the current time is used, so results vary between runs.

On this page, you can modify HMM search and output parameters.

The following parameters are available:

ParameterDescription
Output GenbankLocation of the output data file. If set, the port “Location” is not used.
Accumulate resultsCombine all results into one file or create separate files (with numeric suffixes).
Result annotationName of the result annotations.
Number of seqsCalculate E-values as if the sequence DB had this many entries.
Filter by high E-valueUse an E-value threshold to exclude low-probability hits.
Filter by low scoreAn alternative to E-value filtering — excludes low-score hits.