Tutorial: Shared Databases in UGENE


Shared Database in UGENE

Do you need to share data, such as sequences, with colleagues in your lab? Do you want to create a plasmid database, a gene database, or a genome sequence database?

This tutorial shows how to do this with UGENE. You can share sequences, annotated sequences, multiple alignments, and other data. UGENE creates a storage that is accessible only through UGENE, with security provided by MySQL.

Architecture

One advantage of the UGENE approach to shared storage is that it is easy to install. All you need to use the shared bioinformatics storage is a MySQL server that is installed and minimally configured.
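
As a rough illustration of what "minimally configured" could mean, the sketch below generates standard MySQL statements that create an empty database and one account for it. The database name `ugene_shared`, the account name `lab_member`, and the password are hypothetical placeholders, and the choice of privileges is an assumption, not a requirement documented by UGENE. Check the UGENE documentation for the privileges it actually needs.

```python
def setup_statements(database, user, password, read_only=False, host="%"):
    """Return MySQL statements that prepare a database and one account.

    All names are placeholders chosen by the caller. The privilege sets
    (ALL PRIVILEGES vs. SELECT) are an assumption for illustration only.
    """
    # Reject characters that would break the naive quoting used below.
    for value in (database, user, password, host):
        if any(ch in value for ch in "`'\\"):
            raise ValueError(f"unsupported character in {value!r}")

    privileges = "SELECT" if read_only else "ALL PRIVILEGES"
    return [
        f"CREATE DATABASE IF NOT EXISTS `{database}`;",
        f"CREATE USER IF NOT EXISTS '{user}'@'{host}' IDENTIFIED BY '{password}';",
        f"GRANT {privileges} ON `{database}`.* TO '{user}'@'{host}';",
    ]


if __name__ == "__main__":
    # Print the statements so an administrator can review them
    # before running them in a MySQL client.
    for statement in setup_statements("ugene_shared", "lab_member", "change-me"):
        print(statement)
```

Passing `read_only=True` produces a `GRANT SELECT` statement instead, which is one way an account could be limited to viewing data.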

After that, you run UGENE, connect to the server, and initialize your storage with a few mouse clicks, which makes it ready to use. The storage is displayed in the project in the same way as local documents.

You can organize the data in your storage using folders. Create a folder hierarchy to order your data and import new files into it.

Tools

There are flexible graphical tools for importing data into your genome database. For example, you can import a folder with all its gene files into a gene database.

In this case, the folder hierarchy is reproduced in the storage.
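
The idea of reproducing a local folder hierarchy can be sketched as a small planning function. This is not UGENE code. The function name `plan_import` and the `/`-separated storage paths are assumptions made only for illustration. It walks a local folder and reports, for each file, the storage folder that would mirror its location.

```python
import os
import posixpath


def plan_import(local_root, storage_root="/"):
    """Map each file under local_root to a mirrored storage folder.

    Returns a list of (local_file_path, storage_folder) pairs. The
    storage folder layout is hypothetical and uses '/' as a separator.
    """
    local_root = os.path.abspath(local_root)
    top = os.path.basename(local_root)
    plan = []
    for current_dir, subdirs, files in os.walk(local_root):
        subdirs.sort()  # visit subfolders in a stable order
        relative = os.path.relpath(current_dir, local_root)
        parts = [top] if relative == "." else [top] + relative.split(os.sep)
        target_folder = posixpath.join(storage_root, *parts)
        for name in sorted(files):
            plan.append((os.path.join(current_dir, name), target_folder))
    return plan


if __name__ == "__main__":
    # Example: list where each file in the current folder would end up.
    for local_path, folder in plan_import(".", "/imported"):
        print(f"{local_path} -> {folder}")
```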

Folders and imported objects can always be renamed, removed, or replaced using drag and drop.

If you want to work with the data locally, you can export the objects you need to a local drive. The data is stored in an internal UGENE format. As usual, you can choose the output format of your data from the roughly 20 formats that UGENE supports.

It is possible to use the storage for keeping data of all types supported by UGENE.

Moreover, when UGENE remotely accesses big data, such as next-generation sequencing (NGS) data, it loads only the required part of the data. So you can access huge genome assemblies with the UGENE Assembly Browser and view them in a shared way. The Assembly Browser loads only the part of the assembly that is in focus from the shared storage.
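
The principle of loading only the region in focus can be illustrated with a toy in-memory model. This is a sketch of the general idea, not UGENE's storage format or query logic. The class `ReadStore` and its method `fetch_window` are hypothetical names. Reads are kept sorted by start position, so a query for the visible window touches only nearby reads instead of the whole assembly.

```python
from bisect import bisect_left


class ReadStore:
    """Toy stand-in for a remote assembly: reads sorted by start position."""

    def __init__(self, reads):
        # Each read is a (start, end) pair with an exclusive end.
        self.reads = sorted(reads)
        self.starts = [start for start, _ in self.reads]
        self.max_len = max((end - start for start, end in self.reads), default=0)

    def fetch_window(self, window_start, window_end):
        """Return only the reads overlapping [window_start, window_end)."""
        # A read that starts before window_start - max_len is too short
        # to reach the window, so the scan can begin after those reads.
        first = bisect_left(self.starts, window_start - self.max_len)
        visible = []
        for start, end in self.reads[first:]:
            if start >= window_end:
                break  # every later read starts beyond the window
            if end > window_start:
                visible.append((start, end))
        return visible


if __name__ == "__main__":
    store = ReadStore([(0, 50), (40, 90), (100, 150), (500, 550), (1000, 1050)])
    print(store.fetch_window(45, 120))
```

In a database-backed storage, the same idea would typically be expressed as an indexed range query, so the server returns only the rows for the visible region.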

This opens another interesting application of the shared storage.

If imported NGS data is stored on a common server in a lab, researchers from the lab can work with the data, for example browse it in the UGENE Assembly Browser, without copying the data to their computers or laptops. Because of the special format used to store NGS data in UGENE, it is possible to see the NGS data coverage, quickly navigate to the most covered regions without a full download, export the consensus sequence, and use other features of the Assembly Browser.

Public Storage

Based on the technology described in this tutorial, we have organized a public storage with commonly used data. The storage is open for read-only access. For now, its genome sequence database keeps DNA sequences of several popular genomes, such as human, mouse, and Drosophila melanogaster, and its plasmid database keeps hundreds of plasmid sequences.

You can use this data from a UGENE instance. Use the connection manager dialog to connect to the storage.

Additional Materials

Documentation page

YouTube video