UGENE Bulletin Board
Adding reference human genome to Ugene 1.31 and detection of cutadapt.py
Oct 4th, 2018 at 1:42pm

Suresh Shettigar   Offline
YaBB Newbies

Posts: 3
*
 
1) I want to use UGENE for NGS studies of human exome samples and targeted panels. How do I add a human reference genome to map reads against? I want to download the human reference genome and add it to UGENE. How do I do that?
Second, which genome should I download? The reference genome from your site contains multiple sequences; how do I give all of them to the "map to reference sequences" script?

2) I am using your UGENE NGS package on Ubuntu 18.04.1 LTS. My system is a Ryzen 3 2200G with 16 GB RAM, OpenCL enabled.
Although cutadapt is present like all the other plugins, UGENE does not detect it and keeps asking me to show its path.

3) Is there a way to increase the RAM available to OpenCL? I get about 3.5 GB usable; is it possible to increase it to 7 GB?

Thanks
Suresh
 
 
Reply #1 - Oct 4th, 2018 at 4:00pm

Olga Golosova   Offline
YaBB Administrator

Posts: 255
*****
 
Hello Suresh,

1) You can download reference data, for example, from Ensembl. Here is the link to the latest release for Homo sapiens: ftp://ftp.ensembl.org/pub/release-94/fasta/homo_sapiens/dna/. Download the gz-archived FASTA sequence for the chromosome you are investigating and unpack it.

If you use the "Raw DNA-Seq data processing" workflow, you input this FASTA file for BWA-MEM (on the "Mapping" tab of the wizard).
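
The unpacking step from point 1 can also be scripted. Below is a minimal sketch in Python, assuming the archive has already been downloaded to a local folder. `unpack_fasta` is a hypothetical helper (not part of UGENE), and the file name in the usage comment is only an example.

```python
import gzip
import shutil
from pathlib import Path


def unpack_fasta(gz_path):
    """Decompress a .gz FASTA file next to the archive and return the new path."""
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")
    out_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
    # Stream the data so a large chromosome is never held fully in memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


# Example (hypothetical file name):
# unpack_fasta("Homo_sapiens.GRCh38.dna.chromosome.21.fa.gz")
```

The resulting `.fa` file is the one to select as the reference on the "Mapping" tab.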

2) Do you have Illumina sequencing reads? In this case you can actually use Trimmomatic instead of Cutadapt.

Also, how did you install UGENE?

Please open the UGENE Application Settings (select "Settings > Preferences" in the UGENE main menu), open the "External Tools" page and find the "cutadapt" and "python" tools. Are the paths to the tools set? Are there any warning signs on the tool icons?

3) Do you mean RAM or OPENCL? If you mean RAM, then in the same Application Settings dialog please go to the "Resources" page and check what "Tasks memory limit" value you have there (it corresponds to RAM).
 
 
Reply #2 - Oct 8th, 2018 at 1:38pm

Suresh Shettigar   Offline
YaBB Newbies

Posts: 3
*
 
Hello Olga,

Thank you for your detailed explanation. I need a few more clarifications.

>1)You can download reference data, for example, from Ensembl...........

I have a FASTQ file from a panel of 6000 genes. Can we give the whole human genome as a FASTA file? I have downloaded the latest whole human genome build with alt loci and patches. Our population is from India and we have a lot of variants that differ from the reference genome, so it would be helpful to use the alt loci. Please advise whether we need to give it one chromosome at a time or whether the whole genome can be given for mapping to the reference. Or is it possible to automate the process of one chromosome at a time?

> 2) Do you have Illumina sequencing reads? In this case you can actually use Trimmomatic instead of Cutadapt.

Yes, we have Illumina sequencing reads.

I have downloaded the portable NGS package for Linux and installed it.

>Please open the UGENE Application Settings (select "Settings > Preferences" in the UGENE main menu), open the "External Tools" page and find "cutadapt" and "python" tools. Are the paths to the tools set? Are there any warnings signs on the tools icons?

Everything is set except for cutadapt, which has a triangle sign.

> 3) Do you mean RAM or OPENCL?
I meant OPENCL

Do you recommend UGENE, or is it at least possible to use it, for complete NGS analysis of whole human exomes? There are no whole-exome NGS analysis tutorials among your video tutorials.
 
 
Reply #3 - Oct 8th, 2018 at 2:47pm

Olga Golosova   Offline
YaBB Administrator

Posts: 255
*****
 
Hello Suresh,

Quote:
I have a FASTQ file from a panel of 6000 genes. Can we give the whole human genome as a FASTA file? I have downloaded the latest whole human genome build with alt loci and patches. Our population is from India and we have a lot of variants that differ from the reference genome, so it would be helpful to use the alt loci. Please advise whether we need to give it one chromosome at a time or whether the whole genome can be given for mapping to the reference. Or is it possible to automate the process of one chromosome at a time?

You can give any nucleotide sequence as the reference. However, it's better not to join chromosomes into one sequence; BWA-MEM works faster when they are kept separate. I recommend specifying one chromosome at a time (i.e. an unarchived FASTA file). By the way, see also this link.
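
If the downloaded reference is a single FASTA file with many sequences, preparing one file per chromosome can be scripted outside UGENE. The sketch below is one way to do that in Python and is not a UGENE feature. `split_fasta` is a hypothetical helper, and it assumes the first word of each header line is a usable file name.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each sequence of a multi-sequence FASTA file to its own file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    try:
        with open(fasta_path) as src:
            for line in src:
                if line.startswith(">"):
                    if out is not None:
                        out.close()
                    # First word of the header (e.g. "1" or "chrX") names the file.
                    words = line[1:].split()
                    name = words[0] if words else f"seq{len(written) + 1}"
                    path = out_dir / f"{name.replace('/', '_')}.fa"
                    out = open(path, "w")
                    written.append(path)
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Each returned path can then be given to the workflow as the reference, one run per chromosome.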

Quote:
Yes, we have Illumina sequencing reads.
I have downloaded the portable NGS package for Linux and installed it.
Everything is set except for cutadapt, which has a triangle sign.

OK, you have two options: 1) use Trimmomatic instead of Cutadapt, 2) configure Cutadapt. If you like, I can help you do that over Skype. Please write to ugene@unipro.ru if you need help.

Quote:
I meant OPENCL

Only a couple of tools have video card optimization: Smith-Waterman and UGENE Genome Aligner. If you don't use these tools, OpenCL is not used.

Quote:
Do you recommend UGENE, or is it at least possible to use it, for complete NGS analysis of whole human exomes? There are no whole-exome NGS analysis tutorials among your video tutorials.

This depends on your needs. In general, yes, it is possible.
Thank you for your feedback! We'll consider creating such a tutorial in the future.
 
 
Reply #4 - Oct 14th, 2018 at 2:56pm

Suresh Shettigar   Offline
YaBB Newbies

Posts: 3
*
 
Hello Olga,

Thanks for your help.
I am attaching the error files for cutadapt, in case they help you tell me where I am going wrong.

I took a roundabout route for my analysis. I ran the map-to-reference workflow on a Windows 7 PC with 8 GB RAM and got "SN_R2.fastq.filtered.fastq.cutadapt.fastq.trimmed.fastq", which was to be mapped using BWA-MEM, but unfortunately my Windows PC repeatedly crashed at this point.

So I took this file and ran it on the Linux PC mentioned earlier. It worked perfectly, even when I gave the whole human genome reference file as .gz. I got "Oct9.sam.bam.sorted.bam.merged.bam.filtered.bam.sorted.bam.nodup.bam.sorted.bam", which I am passing to the SnpEff program. This too had some issues: first, downloading GRCh38.86 failed. I am retrying the download as I send this message, and it is still in progress.

Can I use Annovar inside UGENE? If yes, can you please tell me how to go about it? Or do you recommend only SnpEff?

regards,
Suresh
 

cutadapt_error.jpg (120 KB)
python_installed.jpg (109 KB)
 
Reply #5 - Oct 15th, 2018 at 5:17pm

Olga Golosova   Offline
YaBB Administrator

Posts: 255
*****
 
Hello Suresh,

Quote:
I am attaching the error files for cutadapt, in case they help you tell me where I am going wrong.

Cutadapt requires a "python2" executable file. You specified the system Python 2.7, but there is probably no "python2" symlink. You have at least the following two options:
1) Create a symlink for the system Python. Run this command: "sudo ln -s /usr/bin/python2.7 /usr/bin/python2".
2) Use the python from the UGENE external tools package (i.e. there is a python tool located near cutadapt).
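
Before choosing between the two options, it can help to confirm whether a "python2" executable is visible on the PATH at all. The sketch below does that with Python's standard library and can be run with any available Python 3. `find_executable` is a hypothetical helper name, and the check says nothing about which interpreter UGENE itself is configured to use.

```python
import shutil


def find_executable(name="python2"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable("python2")
    if path:
        print(f"python2 found at {path}")
    else:
        print("python2 not found on PATH: create the symlink "
              "or point UGENE at the bundled python")
```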

Quote:
but unfortunately my Windows PC repeatedly crashed at this point

Could you please describe this in more detail? Are you running the "Raw DNA-Seq data processing" workflow? Could you share some test data so that we can reproduce the issue?

Quote:
downloading GRCh38.86 failed

Do you have a good network connection? The file is about 1.5 Gb. You could try a workaround. Download the following file using, e.g., a web browser: https://datapacket.dl.sourceforge.net/project/snpeff/databases/v4_3/snpEff_v4_3_.... Unpack the archive and put it into a subfolder of your home folder, so that it looks as follows: /home/your_name/.UGENE_downloaded/snpeff_data_4.3i/GRCh38.86/.
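
The final copy step of this workaround could be sketched in Python as follows. `install_snpeff_db` is a hypothetical helper. It assumes you pass it the unpacked folder whose contents belong directly under GRCh38.86/, so check the layout of the unpacked archive first.

```python
import shutil
from pathlib import Path


def install_snpeff_db(unpacked_db_dir, home=None, genome="GRCh38.86"):
    """Copy an unpacked snpEff genome folder to the location given above."""
    home = Path(home) if home else Path.home()
    target = home / ".UGENE_downloaded" / "snpeff_data_4.3i" / genome
    target.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok needs Python 3.8+; it lets a retry overwrite a partial copy.
    shutil.copytree(unpacked_db_dir, target, dirs_exist_ok=True)
    return target


# Example (hypothetical source path):
# install_snpeff_db("/home/your_name/Downloads/GRCh38.86")
```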

Quote:
Can I use Annovar inside UGENE

No, Annovar is not integrated into UGENE.
 