
The workflow outputs two files:

  • lib3.R1_001_Kraken_classification.txt: the Kraken output, which contains the classification of each NGS read;
  • lib3.R1_001_Kraken_report.txt: a report file generated by UGENE, with general statistics for each taxID.

The second file is a tab-delimited text file. Open it in an Excel-like application. For each taxID, the file contains, in particular, the following information:

  • tax_name: scientific name, associated with the taxID;
  • directly_num: number of NGS reads, directly assigned to the taxID;
  • clade_num: number of NGS reads assigned to this taxID or to any of its descendants in the taxonomy tree (e.g. «clade_num» for «tax_id = 2» includes all bacterial reads).
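
If you prefer a script to a spreadsheet, the report can also be read programmatically. The following is a minimal sketch, assuming the file has a header row whose column names match the ones listed above (tax_name, directly_num, clade_num) plus a tax_id column; check the header of your own report before relying on these names. The function top_clades is a hypothetical helper, and the sample rows are made-up numbers for illustration only, not real workflow results.

```python
import csv
import io


def top_clades(report, n=5):
    """Return the n rows with the largest clade_num from a tab-delimited report.

    `report` is any open text file or file-like object.
    Column names are assumptions based on the tutorial text.
    """
    reader = csv.DictReader(report, delimiter="\t")
    rows = []
    for row in reader:
        # Counts are stored as text in the file; convert them for sorting.
        row["directly_num"] = int(row["directly_num"])
        row["clade_num"] = int(row["clade_num"])
        rows.append(row)
    return sorted(rows, key=lambda r: r["clade_num"], reverse=True)[:n]


# Made-up sample in the assumed layout (illustrative counts only).
SAMPLE = (
    "tax_id\ttax_name\tdirectly_num\tclade_num\n"
    "2\tBacteria\t10\t150\n"
    "1224\tProteobacteria\t40\t120\n"
    "10239\tViruses\t5\t30\n"
)

for row in top_clades(io.StringIO(SAMPLE), n=2):
    print(row["tax_id"], row["tax_name"], row["directly_num"], row["clade_num"])
```

To run it on the actual report, open the file with open("lib3.R1_001_Kraken_report.txt", newline="") and pass the file object to top_clades instead of the in-memory sample.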

Exercises

Exercise: Repeat the taxonomic classification workflow with the lib5 data. Compare the results.

Exercise: Classify reads with the default workflow using SPAdes and Kraken. Compare the results.

What’s next?

MiniKraken is a rather small pre-built database. You can build a custom database with the «Build Kraken Database» element and use it instead of MiniKraken.

You can also use the parallel and serial taxonomic classification workflows with the additional classifiers CLARK, DIAMOND and WEVOTE. Note that this requires more disk space (the full UGENE package with metagenomics data takes ~250 GB) and more computational resources (RAM, etc.).

Acknowledgement

This tutorial is an adapted version of the tutorial by Carla Mavian, prepared for the metagenomics practical session at VEME 2018 (see here).

The metagenomics framework in UGENE was supported by the VIROGENESIS project.