Call Variants with SAMtools

Variant calling in UGENE can be performed using the SAMtools mpileup and bcftools view utilities. For more information about SAMtools and its utilities, visit the SAMtools homepage. Both utilities are embedded in UGENE, so no additional configuration is necessary.

How to Use This Sample

If you haven’t used the workflow samples in UGENE before, please refer to the “How to Use Sample Workflows” section of the documentation.

Workflow Sample Location

The workflow sample “Call Variants with SAMtools” is located in the “NGS” section of the Workflow Designer samples.

Workflow Image

Workflow Wizard

The wizard consists of 5 pages.


1. Input Reference Sequence and Assembly

Provide a file with a reference sequence and a sorted BAM or SAM file.
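
If the input is a SAM file, its header can give a quick hint about whether it is sorted. The sketch below assumes that "sorted" here means coordinate-sorted, which the SAM format declares with the `SO:coordinate` tag on the `@HD` header line. The function name `sam_is_coordinate_sorted` is hypothetical and is not part of UGENE or SAMtools. The tag is only a declaration written by the tool that produced the file, so this check does not verify the actual record order.

```python
from typing import Iterable


def sam_is_coordinate_sorted(header_lines: Iterable[str]) -> bool:
    """Return True if the @HD header line declares SO:coordinate.

    Hypothetical helper: it only inspects the declared sort order,
    it does not check the alignment records themselves.
    """
    for line in header_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@HD":
            return "SO:coordinate" in fields[1:]
    return False


# Example with an in-memory header instead of a real file.
example_header = [
    "@HD\tVN:1.4\tSO:coordinate\n",
    "@SQ\tSN:chr1\tLN:1000\n",
]
print(sam_is_coordinate_sorted(example_header))
```

For a file on disk, the same function can be given an open text file handle, since it stops reading at the first non-header line.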


2. SAMtools mpileup Parameters

| Parameter | Description |
| --- | --- |
| Count anomalous read pairs | Do not skip anomalous read pairs in variant calling. |
| Disable BAQ computation | Disable probabilistic realignment for the computation of base alignment quality (BAQ). BAQ helps reduce false SNPs caused by misalignments, so disabling it may increase them. |
| Mapping quality downgrade | Coefficient for downgrading the mapping quality of reads with excessive mismatches. Recommended value for BWA: 50. |
| Max reads per BAM | Maximum number of reads per input BAM considered at a position. |
| Extended BAQ computation | Improves sensitivity, especially for MNPs, but may reduce specificity. |
| BED or position list file | File with the list of regions or sites for which to generate the pileup. |
| Pileup region | Generate the pileup only for the specified region. |
| Minimum mapping quality | Skip alignments with a mapping quality below this value. |
| Minimum base quality | Ignore bases with a quality score below this value. |
| Illumina-1.3+ encoding | Assume that quality values are in the Illumina 1.3+ encoding. |
| Gap extension error | Phred-scaled gap extension sequencing error probability. Lower values allow longer indels. |
| Homopolymer errors coefficient | Coefficient for modeling indel sequencing errors in homopolymers. |
| No INDELs | Disable INDEL calling. |
| Max INDEL depth | Skip INDEL calling if the average per-sample depth is above this value. |
| Gap open error | Phred-scaled gap open sequencing error probability. Lower values lead to more indel calls. |
| List of platforms for indels | Comma-separated list of sequencing platforms from which indel candidates are collected (e.g., ILLUMINA). |
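
For orientation, several of the parameters above correspond to command-line options of the legacy `samtools mpileup` tool (0.1.x series). The sketch below shows how a few of them might be assembled into a command line. The flag letters follow the SAMtools 0.1.x manual, but the function `build_mpileup_command`, its argument names, and its default values are illustrative assumptions. How UGENE invokes the embedded tool internally is not described here.

```python
from typing import List, Optional


def build_mpileup_command(
    reference: str,
    bam: str,
    disable_baq: bool = False,      # "Disable BAQ computation" (-B)
    mapq_downgrade: int = 0,        # "Mapping quality downgrade" (-C)
    min_mapq: int = 0,              # "Minimum mapping quality" (-q)
    min_baseq: int = 13,            # "Minimum base quality" (-Q)
    region: Optional[str] = None,   # "Pileup region" (-r)
    no_indels: bool = False,        # "No INDELs" (-I)
) -> List[str]:
    """Assemble an argument list for legacy `samtools mpileup` (hypothetical helper)."""
    # -u: uncompressed BCF output, -f: reference FASTA
    cmd = ["samtools", "mpileup", "-u", "-f", reference]
    if disable_baq:
        cmd.append("-B")
    if mapq_downgrade:
        cmd += ["-C", str(mapq_downgrade)]
    cmd += ["-q", str(min_mapq), "-Q", str(min_baseq)]
    if region:
        cmd += ["-r", region]
    if no_indels:
        cmd.append("-I")
    cmd.append(bam)
    return cmd


# Example: BWA alignments, restricted to one region (file names are placeholders).
print(" ".join(build_mpileup_command(
    "ref.fa", "reads.bam", mapq_downgrade=50, region="chr1:1-1000")))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting.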

3. SAMtools bcftools view Parameters

| Parameter | Description |
| --- | --- |
| Retain all alternatives | Keep all alternate alleles at variant sites. |
| Indicate PL | Indicate that the PL field was generated by r921 or an earlier version. |
| No genotype information | Do not output per-sample genotype data. |
| A/C/G/T only | Skip sites where REF is not A, C, G, or T. |
| List of sites | Output information only for the sites listed in this file. |
| QCALL likelihood | Output in the QCALL likelihood format. |
| List of samples | File with the list of samples to use and their ploidy. Ploidy can be 1 or 2. |
| Min samples fraction | Skip loci that are covered in fewer than the specified fraction of samples. |
| Per-sample genotypes | Call per-sample genotypes at variant sites. |
| INDEL-to-SNP ratio | Ratio of the INDEL mutation rate to the SNP mutation rate. |
| Gap open error | Error probability for opening gaps (affects indels). |
| Max P(ref\|D) | A site is considered a variant if this probability is below the specified value. |
| Pair/trio calling | Enable pair or trio (family-based) variant calling. Trio calling requires the trio members to be configured with the list of samples (-s). |
| N group-1 samples | Number of samples in group 1 for contrast SNP calling. Outputs: PC2, PCHI2, QCHI2. |
| N permutations | Number of permutations for the association test. |
| Max P(chi²) | Perform permutations only for loci with P(chi²) below this threshold. |
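
As a rough guide, many of these parameters resemble options of the legacy `bcftools view` command shipped with SAMtools 0.1.x. The sketch below keeps an assumed mapping from wizard labels to flags in a dictionary and turns a set of chosen values into an argument list. The flag letters are taken from the SAMtools 0.1.x manual. The mapping to the wizard labels, the name `bcftools_view_args`, and the trailing `-` (read input from standard input) are assumptions made for illustration.

```python
from typing import Dict, List, Union

# Assumed correspondence between wizard labels and legacy `bcftools view` flags.
BCFTOOLS_FLAGS: Dict[str, str] = {
    "Retain all alternatives": "-A",
    "No genotype information": "-G",
    "A/C/G/T only": "-N",
    "Min samples fraction": "-d",
    "Per-sample genotypes": "-g",
    "INDEL-to-SNP ratio": "-i",
    "Pair/trio calling": "-T",
    "N group-1 samples": "-1",
    "N permutations": "-U",
}

Setting = Union[bool, int, float, str, None]


def bcftools_view_args(settings: Dict[str, Setting]) -> List[str]:
    """Translate wizard-style settings into an argument list (hypothetical helper)."""
    args = ["bcftools", "view"]
    for label, value in settings.items():
        flag = BCFTOOLS_FLAGS[label]
        if value is True:
            args.append(flag)            # boolean switch that is turned on
        elif value is False or value is None:
            continue                     # switch turned off or value not set
        else:
            args += [flag, str(value)]   # option that takes a value
    args.append("-")                     # read BCF from standard input
    return args


print(bcftools_view_args({
    "Per-sample genotypes": True,
    "Min samples fraction": 0.5,
    "A/C/G/T only": False,
}))
```

Keeping the mapping in one dictionary makes it easy to check each label against the manual of the SAMtools version actually in use.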

4. SAMtools vcfutils varFilter Parameters

| Parameter | Description |
| --- | --- |
| Log filtered | Print filtered variants into the log. |
| Minimum RMS quality | Minimum root-mean-square mapping quality. |
| Minimum read depth | Minimum read depth. |
| Maximum read depth | Maximum read depth. |
| Alternate bases | Minimum number of alternate bases. |
| Gap size | Filter SNPs within the given distance of a gap. |
| Window size | Window size for filtering adjacent gaps. |
| Strand bias | Minimum P-value for strand bias. |
| BaseQ bias | Minimum P-value for base quality bias. |
| MapQ bias | Minimum P-value for mapping quality bias. |
| End distance bias | Minimum P-value for distance-to-end bias. |
| HWE | Minimum P-value for Hardy-Weinberg equilibrium (plus F<0). |
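
To make the threshold parameters more concrete, the sketch below applies four of them (minimum RMS mapping quality, minimum and maximum read depth, minimum number of alternate bases) to a single VCF record. It is a simplified illustration, not the actual `vcfutils.pl varFilter` logic. It assumes that the INFO column carries the `DP`, `MQ`, and `DP4` tags as written by SAMtools, and that the last two `DP4` counts are the alternate-allele bases. The function names and default thresholds are illustrative.

```python
from typing import Dict


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column such as 'DP=20;MQ=40' into a dictionary."""
    fields: Dict[str, str] = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def passes_filters(
    vcf_line: str,
    min_rms_mapq: float = 10,      # "Minimum RMS quality"
    min_depth: int = 2,            # "Minimum read depth"
    max_depth: int = 10_000_000,   # "Maximum read depth"
    min_alt_bases: int = 2,        # "Alternate bases"
) -> bool:
    """Return True if the record meets all four thresholds (simplified sketch)."""
    columns = vcf_line.rstrip("\n").split("\t")
    info = parse_info(columns[7])  # INFO is the eighth VCF column
    depth = int(info.get("DP", 0))
    rms_mapq = float(info.get("MQ", 0))
    # DP4 = ref-forward, ref-reverse, alt-forward, alt-reverse base counts
    dp4 = [int(x) for x in info["DP4"].split(",")] if "DP4" in info else [0, 0, 0, 0]
    alt_bases = dp4[2] + dp4[3]
    return (
        rms_mapq >= min_rms_mapq
        and min_depth <= depth <= max_depth
        and alt_bases >= min_alt_bases
    )


record = "chr1\t100\t.\tA\tG\t50\t.\tDP=20;DP4=5,5,6,4;MQ=40"
print(passes_filters(record))                 # all thresholds met
print(passes_filters(record, max_depth=10))   # depth 20 exceeds the maximum
```

The bias and Hardy-Weinberg thresholds in the table work on P-values stored in other INFO tags and are left out of this sketch.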

5. Output Variations

Configure how the output is written.


The work on this pipeline was supported by grant RUB1-31097-NO-12 from NIAID.