Finding Patterns Using the Smith-Waterman Algorithm
Task Name: find-sw
Searches for a pattern in a nucleotide or protein sequence using the Smith-Waterman algorithm and saves the regions found as annotations.
Parameters:
in — Input sequence file. [String, Required]
out — Output file with the annotations. [String, Required]
name — Name of the annotated regions. [String, Optional, Default: “misc_feature”]
ptrn — Subsequence pattern to search for (e.g., AGGCCT). [String, Required]
score — Minimum percent identity required between the pattern and a subsequence. [Number, Optional, Default: 90]
matrix — Scoring matrix. [String, Optional, Default: “Auto”]
Among others, the following values are available:
- blosum62
- dna
- rna
- dayhoff
- gonnet
- pam250
The available matrices are stored in the $UGENE\data\weight_matrix directory.
filter — Results filtering strategy. [String, Optional, Default: “filter-intersections”]
The following values are available:
- filter-intersections
- none
Example:
ugene find-sw --in=human_T1.fa --out=sw.gb --ptrn=TGCT --filter=none
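The optional parameters can be combined in one call. The following is a minimal sketch, not part of the original task description, that builds find-sw command lines for several identity thresholds. The output file names (sw_100.gb and so on), the annotation name sw_hit, and the threshold list are assumptions chosen for illustration. The matrix value dna is taken from the list above. Options are written with two ASCII hyphens.

```shell
#!/bin/sh
# Sketch: build find-sw command lines for several identity thresholds.
# File names, the annotation name, and the thresholds are illustrative assumptions.

input="human_T1.fa"
pattern="TGCT"

find_sw_cmd() {
    # $1 = percent identity passed to the score parameter
    echo "ugene find-sw --in=$input --out=sw_$1.gb --ptrn=$pattern --score=$1 --matrix=dna --name=sw_hit --filter=filter-intersections"
}

for score in 100 90 80; do
    cmd=$(find_sw_cmd "$score")
    echo "$cmd"
    # Execute only when the ugene executable is available on PATH.
    if command -v ugene >/dev/null 2>&1; then
        $cmd
    fi
done
```

Each run writes its annotations to a separate file, so the results for different thresholds can be compared afterwards.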