Sequencing results (Jun/18/2012)

Perhaps a weird question,
but when I get my sequencing results back, the first bases and the last ones are always bad. Why is this?
For the last ones, I guess it has to do with the polymerase starting to work poorly, etc.
But how come the first bases are bad too?

-lyok-

it has to do with the cleaning method (according to beckman-coulter). with standard cleaning (precipitation) you can start reading from about 20 bases from the primer (there will still be some noise). with more stringent cleaning (spin column, eg qiagen) you can start reading about 5 bases from the primer.

also, as with any electrophoresis method, you get close early peaks and diffuse late peaks.

if you are using next gen sequencing then disregard this post.

-mdfenko-

I am not really sure what you mean by cleaning.
By cleaning, do you simply mean cleaning the sample, removing components, like salts, that you don't want?

I find it hard to understand why the first bases would be bad and then, after a few bases, it gets better. The "dirt" in the sample isn't removed, so how come it gets better all of a sudden?

I am not sure what you mean by this: as with any electrophoresis method, you get close early peaks and diffuse late peaks. How come you get early peaks?


And next gen sequencing? I don't sequence the samples myself, we ship them to a company, so I would find it weird if they aren't using next gen sequencing.


(By next gen sequencing, you do mean Illumina etc.?)

-lyok-

yes (regarding next gen sequencing). next gen is not sanger sequencing.

i work with a beckman-coulter ceq 8000. it works by separating sanger (dideoxy termination) sequencing samples by capillary electrophoresis (similar to abi sequencers). the early bases come through tightly packed and usually saturated (fluorescence), making it difficult to properly position the bases. the late bases come through as broader, more diffuse, peaks so you may get false base calls.
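
In practice, the unreliable bases at both ends of a read are usually trimmed before the sequence is used. Here is a minimal sketch of that idea, assuming each base comes with a Phred-style quality score (the thread does not mention quality scores, so that is an assumption, and the function name, threshold, and window size are hypothetical choices for illustration only):

```python
def trim_low_quality_ends(quals, threshold=20, window=10):
    """Return (start, end) indices of the region kept after end trimming.

    quals: list of per-base quality scores (assumed Phred-style).
    The read is kept from the start of the first window whose mean
    quality reaches `threshold` to the end of the last such window.
    Returns (0, 0) if no window passes.
    """
    n = len(quals)
    if n < window:
        return (0, 0)
    # Start positions of every window whose mean quality is acceptable.
    good = [
        i for i in range(n - window + 1)
        if sum(quals[i:i + window]) / window >= threshold
    ]
    if not good:
        return (0, 0)
    return (good[0], good[-1] + window)


# Made-up scores: noisy start, clean middle, diffuse end.
example_quals = [8] * 20 + [40] * 100 + [10] * 30
start, end = trim_low_quality_ends(example_quals)
print(f"keep bases {start} to {end - 1}")
```

Because the cut is based on a window average, a few weak bases can survive at each edge of the kept region; a stricter threshold or a smaller window trims harder.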

as for cleaning, after running the reactions, you clean up the sample and suspend the fluorescent dna fragments in formamide (water for abi, i think) for electrokinetic injection onto the capillaries.

-mdfenko-

Aha.
I see what you mean. I wasn't really familiar with the process and didn't link it with electrophoresis, which I am not that familiar with either, but I see your point.
It's just because the first parts of the electrophoresis process are not "clean" enough, still having more contamination etc., that they don't respond well to the sequencing.
However, how come the fluorescence is higher in the first bases? Isn't the fluorescence staining done randomly, or do you only stain the first bases?
And why are the first bases more packed?
Not sure there is an answer to that last question; perhaps it's just a general effect?

I am not that familiar with electrophoresis, so I am not aware of those general "rules".

-lyok-

I think the real problem is that the early fragments are very short, and the electrophoresis has a hard time resolving the difference in their lengths. The gel is optimized for resolution on longer fragments. I have not noticed that the resolution or peak intensity depends noticeably on the purity of the samples, but I'm most familiar with the ABI instruments. A very major effect is due to the amount of sample. Either too much or too little is quite bad for high quality sequencing.

-phage434-

as phage434 points out, early fragments are short and the gel in the capillary is optimized for larger fragments so separation is not as clear as later in the run. however, as i pointed out in an earlier post, you can obtain reliable data as close as 5 bases from the primer with better cleaning of the reaction products (at least, you can with the ceq sequencers, their gel formulations are different from the abi sequencers).
as for your questions regarding fluorescence...
sanger sequencing depends on dideoxy termination of the product nucleic acid. for the capillary sequencers, the dideoxy nucleotides are fluorescently labelled. once a dideoxy nucleotide is incorporated, chain extension is terminated. shorter chains are favored by this method although reliable data can be obtained for as long as 800 bases under standard conditions (more using modified electrophoresis conditions).
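
To see why shorter chains are favored, here is a rough sketch that treats termination as a simple coin flip at every position: with a fixed chance p of incorporating a dideoxy nucleotide at each step, the fraction of chains that stop at length n is p(1-p)^(n-1). This ignores base composition and polymerase behavior, and the value of p below is purely illustrative (an assumption, not a figure from any kit or instrument):

```python
def termination_fraction(length, p_terminate=0.005):
    """Fraction of chains that terminate exactly at `length` bases.

    Simplified model: every extension step has the same probability
    `p_terminate` of adding a dideoxy nucleotide and stopping the chain.
    """
    return p_terminate * (1 - p_terminate) ** (length - 1)


# Relative amount of product at a few fragment lengths (illustrative p).
for n in (20, 100, 400, 800):
    print(n, termination_fraction(n))
```

In this model the amount of product falls steadily with length, which is consistent with the strong early signal and the weaker late signal described above.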

this wiki page can give you a better understanding: dna sequencing.

-mdfenko-

OK.
Thanks a lot mdfenko and phage434.

I get it now. I was not thinking enough about how the electrophoresis works.
I still find the cleaning part a bit weird. I find it weird that small pieces have more problems regarding the cleaning than larger fragments.
But I guess this is because of "the spreading" of "dirt" over larger vs smaller pieces.

-lyok-

"cleaning" is separating product(s) from left over reactants.

-mdfenko-