Build HMM from Alignment and Test It
This workflow builds a new profile HMM from input alignment, calibrates the HMM, and saves it to a file. It then runs a test HMM search over a sample sequence and saves the test results to a Genbank file.
How to Use This Sample
If you haven’t used the workflow samples in UGENE before, please refer to the “How to Use Sample Workflows” section of the documentation.
Workflow Sample Location
The workflow sample “Build HMM from Alignment and Test It” can be found in the “HMMER” section of the Workflow Designer samples.
Workflow Image
The workflow appears as follows:
Workflow Wizard
The wizard has 4 pages.
1. Input MSA(s)
On this page, you must input MSA(s).
2. Input Sequence(s)
On this page, you must input sequences.
3. HMM Build
On this page, you can modify HMM build parameters.
The following parameters are available:
Parameter | Description |
---|---|
Output HMM profile | Location of the output data file. If this attribute is set, the slot “Location” in port will not be used. |
HMM strategy | Specifies the kind of alignments you want to allow. |
Profile name | Descriptive name of the HMM profile. |
Calibrate profile | Enables/disables optional profile calibration. Calibration increases the sensitivity of the database search. |
Parallel calibration | Number of parallel threads for calibration. |
Fixed length | Fixes the length of generated random sequences. The default is to use various lengths from a Gaussian distribution. |
Mean length | Mean length of synthetic sequences. Default: 325. |
Number of samples | Number of synthetic sequences. Default: 5000. Too few may fail EVD fitting; more improves accuracy. |
Standard deviation | Standard deviation of sequence lengths. Default: 200. Gaussian is left-truncated (no zero-lengths). |
Random seed | Random seed. Use a positive integer for reproducible results. By default, the current time is used, so results vary between runs. |
4. HMM Search
On this page, you can modify HMM search and output parameters.
The following parameters are available:
Parameter | Description |
---|---|
Output Genbank | Location of the output data file. If set, the port “Location” is not used. |
Accumulate results | Combine all results into one file or create separate files (with numeric suffixes). |
Result annotation | Name of the result annotations. |
Number of seqs | Calculate E-values as if the sequence DB had this many entries. |
Filter by high E-value | Use an E-value threshold to exclude low-probability hits. |
Filter by low score | An alternative to E-value filtering — excludes low-score hits. |