Problem with Solexa read assembly - (Feb/20/2011)
hi,
I am trying to sequence a bacterial genome using Solexa reads. From the Solexa run I got 8,053,769 reads.
I use the Geneious software to assemble them. When I tried a de novo assembly, it gave me 381 contigs.
However, I am not sure how to proceed from here.
Since I am an amateur in this field, I am not sure what I need to do.
Please help.
I'm no specialist in bioinformatics at all, but as far as I know you can have a hard time with de novo assembly using Illumina reads.
Whether it will be successful depends very much on the technology you used (paired-end or not) and on the coverage you have. If there is no reference genome, I would prefer using 454 technology to create a scaffold first. Since Illumina reads are short, you will have a hard time with all the repetitive elements in your genome (tRNAs, IS elements, REP sequences).
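To make the coverage question concrete, here is a minimal sketch of the usual estimate: average coverage is the total number of sequenced bases divided by the genome length. Only the read count comes from the original post. The read length (36 bp) and genome size (5 Mb) are assumptions chosen for illustration, so substitute the real values for your run and organism.

```python
def estimated_coverage(num_reads, read_length, genome_size):
    """Average depth of coverage = total sequenced bases / genome length."""
    return num_reads * read_length / genome_size


num_reads = 8_053_769      # read count reported in the original post
read_length = 36           # ASSUMPTION: read length in bp, not given in the thread
genome_size = 5_000_000    # ASSUMPTION: bacterial genome size in bp, not given in the thread

coverage = estimated_coverage(num_reads, read_length, genome_size)
print(f"Estimated average coverage: {coverage:.1f}x")
```

With these assumed numbers the estimate comes out at roughly 58x. Longer reads or a smaller genome would raise it proportionally.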
I don't know which assembler the Geneious software suite uses, but the choice of assembler also has a great influence on the result.
Maybe you can get in contact with some bioinformaticians who can help you with this issue, since finishing a genome is not that easy, and it is a waste of time and money if it is not done correctly.
If you really have 381 contigs, this would mean you'll have a lot of gaps to close by Sanger sequencing!
Try to get some help from specialists!
Regards,
p
roshanbernard on Mon Feb 21 01:04:10 2011 said:
hi,
I am trying to sequence a bacterial genome using Solexa reads. From the Solexa run I got 8,053,769 reads.
I use the Geneious software to assemble them. When I tried a de novo assembly, it gave me 381 contigs.
However, I am not sure how to proceed from here.
Since I am an amateur in this field, I am not sure what I need to do.
Please help.
I'm not familiar with Illumina either, but we are starting on a 454jr. The rep who showed us how to do data analysis recommended randomly selecting fewer reads (especially when you have as many as 8 million). She found the number of reads that was best for getting a single contig for that particular sample. Unfortunately, she said there is no magic number; you need to play around with the number of reads you use. I'd try 1 million first, see if the number of contigs decreases, and go from there. But again, this was advice for 454. I'm not sure it applies to Solexa.
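For anyone who wants to try the random subsampling, here is a rough Python sketch. It is an illustration only, not something the rep or Geneious provided. It assumes single-end reads in plain four-line FASTQ format and uses reservoir sampling, so the whole file never has to fit in memory. The helper names and the in-memory demo data are made up for the example. For paired-end data the same pairs would have to be kept from both files, which this sketch does not handle.

```python
import io
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as tuples of 4 lines (assumes 4-line, unwrapped FASTQ)."""
    while True:
        record = tuple(islice(handle, 4))
        if len(record) < 4:
            break
        yield record


def subsample(records, k, seed=0):
    """Reservoir sampling: pick k records uniformly at random in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)
        else:
            # Keep the new record with probability k / (i + 1)
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = rec
    return reservoir


if __name__ == "__main__":
    # Tiny made-up FASTQ held in memory; a real run would open the reads file instead.
    demo = "".join(f"@read{i}\nACGTACGT\n+\nIIIIIIII\n" for i in range(10))
    picked = subsample(read_fastq(io.StringIO(demo)), k=3, seed=42)
    print(len(picked), "reads kept")
```

On real data you would pass an open file handle to `read_fastq`, set `k` to something like 1,000,000, write the selected records to a new FASTQ file, and assemble that. Changing `seed` gives a different random subset, which helps to check whether the contig count is stable.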
Thank you for the advice. As you suggested, I will try trimming the reads and using a different assembler, and see what the outcome is.