DNA sequencing analysis (Jan/27/2009)
Hi admin and chatters,
I need help with DNA sequencing analysis.
I just got my results.
This reference sequence was taken from NCBI, accession AF012789, for the cytochrome b gene:
ORIGIN
1 tttggctctc ttttaggact ctgcttaatt acacaaatcc tcacagggct atttcttgct
61 atacattaca catcagacat ttctaccgcc ttctcatccg tggcacacat ttgccgagac
121 gtaaactatg gatgactaat tcgtaacctc cacgccaacg gcgcctcatt tttctttatt
181 tgtatttatt tccacatcgg ccgaggccta tactacggct cctacctcta taaagagaca
241 tgaaatgtcg gcgtaatact actgctacta gtgatgatga cggcgttcgt agggtacgtt
301 ctaccc
//
This sequence is only 306 bp.
However, last time I tested the same cytochrome b gene using these primers:
L14841 <5’-AAAAAGCTTCCATCCAACATCTCAGCATGATGAAA-3’> (FORWARD)
and
H15149 <5’-AAACTGCAGCCCCTCAGAATGATATTTGTCCTCA-3’> (REVERSE)
and the results are:
> sequence 1 (L14841/FORWARD) is 334 bp
AGCTCCTCACTGAGGACTCTGTTTAATCGCACAAATTGTCACAGGGCTATTCCTCGCAATACACTACACGTCTGACATTA
CCACCGCCTTCTCATCCGGAGCCCACATCTGGCGAGACGTCAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGA
GCCTCTTTCTTCTTTATCTGCATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGAAACATG
AAACATCGGGGTAATTCTTCTCCTACTAGTTATAATAACCGCCTTTGTAGGTTATGTATTACCCTGAGGACAAATATCAT
TCTGAGGGGCTGCA
> sequence 1 (H15149/REVERSE) is 347 bp
GGCGTTATTTATACATATACCTACAAAGGTGGTTTTATAACTAGTAAGAGAAGAATTACCCCGATGTTTCATGTTTCTTT
ATATAGGTAGGAGCCGTAGTATAGTCCTCGTCCGATGTGGAAATAAATGCAGATAAAGAAAAAAGAGGCTCCGTTGGCGT
GGAGGTTTCGGATTAGTCAACCGTAATTGACGTCTCGGCAGATGTGGGCTACGGATGAGAAGGCGGTGGTAATGTCAGAC
GTGTAGTGTATTGCGAGGAATAGGCCTGTGACAATTTGTGCGATTAAGCAGAGTCCCAGTAGGGAGCCAAAGTTTCATCA
TGCTGAGATGTTGGATGGAAGTTTTAA
> sequence 2 (L14841/FORWARD) is only 329 bp
AGGAAAAGAGGAAATTCTGTAGGGGGGCACAAATTGACCAGGCCTATTCCTCGCAATACACTACACGTCTGACATTACCA
CCGCCTTCTCATCCGTAGCCCACATCTGCCGAGACGTCAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGAGCC
TCTTTCTTCTTTATCTGCATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGAAACATGAAA
CATCGGGGTAATTCTTCTCCTACTAGTTATAATAACCGCCTTTGTAGGTTATGTATTACCCTGAGGACAAATATCATTCT
GAGGGGCTG
> sequence 2 (H15149/REVERSE) is 337 bp
ACTCTACATCACAGGAGTTTTTGTAAGAACTATTTTAGAGAAGAATTACCCCGATGTTTCATGTTTCTTTATATAGGTAG
GAGCCGTAGTATAGTCCTCGTCCGATGTGAAAATAAATGCAAATAAAGAAAAAAGAGGCTCCGTTGGCGTGGAGGTTTCG
GATTAGTCAACCGTAATTGACGTCTCGGCAGATGGGGGCTACGGATGAGAAGGCGGTGGTAATGTCAGACGTGTAGTGTA
TTGCGAGGAATAGGCCTGTGACAATTTGTGCGATTAAGCAGAGTCCCAGTAGGGAGCCAAAGTTTCATCATGCTGAGATG
TTGGATGGAAGTTTTAA
> sequence 3 (L14841/FORWARD) is 334 bp
ACAAACTGTATAATCTGTTAGGGGCTCAAATTTACACAGGCCTATTCCTCGCAATACACTACACGTCTGACATTACCACC
GCCTTCTCATCCGTAGCCCACATCTGCCGAGACGTCAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGAGCCTC
TTTCTTCTTTATCTGCATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGAAACATGAAACA
TCGGGGTAATTCTTCTCCTACTAGTTATAATAACCGCCTTTGTAGGTTATGTATTACCCTGAGGACAAATATCATTCTGA
GGGGCTGCAGTTAA
> sequence 3 (H15149/REVERSE) is 347 bp
GNNNNGGGATAACTCCTCTCCTGCTGGGATGGTGTAAGAACTATTTTAGAGAAGAATTACCCCGATGTTTCATGTTTCTTT
ATATAGGTAGGAGCCGTAGTATAGTCCTCGTCCGATGTGGAAATAAATGCAGATAAAGAAAAAAGAGGCTCCGTTGGCGT
GGAGGTTTCGGATTAGTCAACCGTAATTGACGTCTCGGCAGATGTGGGCTACGGATGAGAAGGCGGTGGTAATGTCAGAC
GTGTAGTGTATTGCGAGGAATAGGCCTGTGACAATTTGTGCGATTAAGCAGAGTCCCAGTAGGGAGCCAAAGTTTCATCA
TGCTGAGATGTTGGATGGAATTTTTA
What is the problem? The reads are not all the same length in bp. Shouldn't the sequences be the same as the one in GenBank (NCBI)?
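For anyone wanting to check a length like the 306 bp above, the position numbers and spaces in a GenBank ORIGIN block can be stripped with a few lines of Python. This is only a sketch: the helper name `origin_to_sequence` is made up for illustration, and the example uses just the first and last data lines of the record pasted in the question, so it does not reproduce the full 306 bases.

```python
def origin_to_sequence(origin_text):
    """Join a GenBank ORIGIN block into one uppercase sequence string."""
    bases = []
    for line in origin_text.splitlines():
        line = line.strip()
        # Skip blank lines, the ORIGIN keyword and the "//" terminator.
        if not line or line == "ORIGIN" or line == "//":
            continue
        # A data line is "<position> <group> <group> ...": drop the position.
        bases.extend(line.split()[1:])
    return "".join(bases).upper()


# First and last data lines of the pasted record (middle lines omitted).
example = """ORIGIN
1 tttggctctc ttttaggact ctgcttaatt acacaaatcc tcacagggct atttcttgct
301 ctaccc
//"""

seq = origin_to_sequence(example)
print(len(seq), seq[:10])
```

Running the same function on the complete ORIGIN block gives the full sequence as one string, which is the form most alignment tools expect.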
Sequencers make mistakes. You need to build a consensus sequence and compare that to the one from NCBI.
Usually the differences are caused by noise at the beginning and end of the read, depending on the strength of the reaction.
Keep in mind that sequence data does not usually start right at the end of the primer. Depending on how you clean up the reaction, readable sequence can start 5-20 bases after the primer.
You need a consensus sequence and an alignment tool (e.g. Omiga, Clustal X).
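As a rough illustration of what a consensus is: once the reads are aligned (with the reverse reads reverse-complemented first), each column is settled by majority vote. The Python sketch below assumes the reads are already aligned and padded with `-` to the same length. The function name and the toy reads are invented for the example, and real assembly tools also weigh base-call quality, which this does not.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote consensus of equal-length, pre-aligned reads.

    '-' (gap) and 'N' (no call) do not vote. A column with no usable
    base, or with a tie for first place, is reported as 'N'.
    """
    if len({len(r) for r in aligned_reads}) != 1:
        raise ValueError("reads must be aligned to the same length")
    result = []
    for column in zip(*aligned_reads):
        votes = Counter(b.upper() for b in column if b.upper() not in "-N")
        ranked = votes.most_common(2)
        if not ranked:
            result.append("N")
        elif len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            result.append("N")
        else:
            result.append(ranked[0][0])
    return "".join(result)


# Toy reads, not taken from the thread.
reads = [
    "ACGTNCGT-A",
    "ACGTACGTTA",
    "-CGAACGTTA",
]
print(consensus(reads))
```

The noisy ends of individual reads get outvoted where the other reads agree, which is why the consensus is the thing to compare against NCBI rather than any single read.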
zack on Jan 29 2009, 11:49 AM said:
Here is my consensus sequence tree. What conclusion can be drawn from that tree?
hanming86 on Feb 4 2009, 04:18 AM said:
I know you're from Malaysia.
Maybe your DNA template is dirty, which causes problems downstream, either in the sequencing reaction or in the run of the sequencing product.
1. What university are you from?
2. What equipment did you use to check your DNA purity? (A gel can't tell you anything about purity; there could be EtOH contamination, for example.)
3. Attach your reads in .abi format. 1st Base is notorious for giving "fake" sequence in the .seq format. A lousy read can still generate a "good" .seq file if the base-calling software is not stringent enough.
4. What was 1st Base's remark on your sequences?
5. Your sequence might not be exactly the same as the one from GenBank unless it comes from the exact same source. Remember evolution and natural selection.
Cheers,
Ming
UTM Skudai
zack on Feb 4 2009, 06:16 AM said:
Hi Ming. Yes, you are right, I'm from Malaysia and I'm still studying at UPM. I'm actually not from a molecular background, but I'm still learning more about it. These results are from the samples I sent to 1st Base ... I don't know what to conclude because the reads are not the same size in bp.
2. I'm using a Hitachi U2910.
3. You can download the attachment.
4.
I would appreciate any opinion and advice you can give on the .abi files.
Thank you, Ming.
Hey, I think your sequences align perfectly well. There seems to be no problem. Extra sequence flanking the PCR product is expected in sequencing.
I didn't actually have time to check whether the primer sequences are within your PCR product or not. I need to go eat lunch first.
And differences from the NCBI sequence should be expected if your source is a different individual.
More details on your work would help.
OK, thanks.
Cheers,
-ming
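Whether a primer falls inside a read can be checked with a plain string search. The reverse primer's binding site should show up near the end of a forward read as the primer's reverse complement. The Python sketch below tries this with H15149 and the last 33 bases of the first forward read (sequence 1, L14841) posted above. The function names and the choice of searching only the last 20 primer bases are assumptions made for the example, and an exact search like this misses sites that contain a miscalled base.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse the sequence and swap A<->T and C<->G (N stays N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_primer_site(read, primer, tail=20):
    """Index in `read` of the reverse complement of the primer's 3' end
    (its last `tail` bases), or -1 if there is no exact match."""
    probe = reverse_complement(primer[-tail:])
    return read.upper().find(probe)


H15149 = "AAACTGCAGCCCCTCAGAATGATATTTGTCCTCA"
# Last 33 bases of the first forward read (sequence 1, L14841 primer).
forward_read_end = "CCCTGAGGACAAATATCATTCTGAGGGGCTGCA"

print(find_primer_site(forward_read_end, H15149))
```

A result of 0 or more means the forward read runs into the H15149 site, so the read covers the PCR product through to the opposite primer.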
If you run the sequences in ClustalW (or any other alignment tool) you'll see how close the sequences are.
It is very clear that the odd-numbered sequences are one type and the even-numbered ones are the other type.
seq4 -----------ACTCTACATCACAGGAGTTTTTGTAAGAACTATTTTAGAGAAGAATTAC 49
seq6 GNNNNGGGATAACTCCTCTCCTGCTGGGATGGTGTAAGAACTATTTTAGAGAAGAATTAC 60
seq2 GGCGTTATTTATACATATACCTACAAAGGTGGTTTTATAACTAGTA-AGAGAAGAATTAC 59
seq3 -------AGGAAAAGAGGAAATTCTGTAGGGGGGCACAAATTGAC-CAG-GCCTATTCCT 51
seq5 ---------ACAAACTGTATAATCTGTTAGGGG-CTCAAATTTACACAG-GCCTATTCCT 49
seq1 -----AGCTCCTCACTGAGGACTCTGTTTAATCGCACAAATTGTCACAG-GGCTATTCCT 54
** * ** * * *
seq4 CCCGATGT--TTCATGTTTCTTTATATAGGTAGGAGCCGTAGTATAGTCCTCGTCCGATG 107
seq6 CCCGATGT--TTCATGTTTCTTTATATAGGTAGGAGCCGTAGTATAGTCCTCGTCCGATG 118
seq2 CCCGATGT--TTCATGTTTCTTTATATAGGTAGGAGCCGTAGTATAGTCCTCGTCCGATG 117
seq3 CGCAATACACTACACGTCTGACATTACCACCGCCTTCTCATCCGTAGCCCACATCTGCCG 111
seq5 CGCAATACACTACACGTCTGACATTACCACCGCCTTCTCATCCGTAGCCCACATCTGCCG 109
seq1 CGCAATACACTACACGTCTGACATTACCACCGCCTTCTCATCCGGAGCCCACATCTGGCG 114
* * ** * ** ** * ** * ** ** * ** * *
seq4 TGAAAATAAATGCAAATAAAGAAAAAAGAGGCTCCGTTGGCGTGGAGGTTTCGGATTAGT 167
seq6 TGGAAATAAATGCAGATAAAGAAAAAAGAGGCTCCGTTGGCGTGGAGGTTTCGGATTAGT 178
seq2 TGGAAATAAATGCAGATAAAGAAAAAAGAGGCTCCGTTGGCGTGGAGGTTTCGGATTAGT 177
seq3 AGACGTCAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGAGCCTCTTTCTTCTT 171
seq5 AGACGTCAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGAGCCTCTTTCTTCTT 169
seq1 AGACGTCAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGAGCCTCTTTCTTCTT 174
* ** * * * * ** * **** **** * ** *
seq4 CAACCGTAATTGACGTCTCGGCAGATGGGGGCTACGGATGAGAAGGCGGTGGTAATGTCA 227
seq6 CAACCGTAATTGACGTCTCGGCAGATGTGGGCTACGGATGAGAAGGCGGTGGTAATGTCA 238
seq2 CAACCGTAATTGACGTCTCGGCAGATGTGGGCTACGGATGAGAAGGCGGTGGTAATGTCA 237
seq3 TATCTGCATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGA 231
seq5 TATCTGCATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGA 229
seq1 TATCTGCATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGA 234
* * * * ** * * * ** * ** *** * ** *
seq4 GACGTGTAGTGTATTGCGAGGAATAGGCCT----GTGACAATTTGTGCGATT--AAGCAG 281
seq6 GACGTGTAGTGTATTGCGAGGAATAGGCCT----GTGACAATTTGTGCGATT--AAGCAG 292
seq2 GACGTGTAGTGTATTGCGAGGAATAGGCCT----GTGACAATTTGTGCGATT--AAGCAG 291
seq3 AACATGAAACATCGGGGTAATTCTTCTCCTACTAGTTATAATAACCGCCTTTGTAGGTTA 291
seq5 AACATGAAACATCGGGGTAATTCTTCTCCTACTAGTTATAATAACCGCCTTTGTAGGTTA 289
seq1 AACATGAAACATCGGGGTAATTCTTCTCCTACTAGTTATAATAACCGCCTTTGTAGGTTA 294
** ** * * * * * *** ** * *** ** ** * *
seq4 AGTCCCAGTAGGGAGCCAAAGTTTCATCATGCTGAGATGTTGGATGGAAGTTTTAA 337
seq6 AGTCCCAGTAGGGAGCCAAAGTTTCATCATGCTGAGATGTTGGATGGAATTTTTA- 347
seq2 AGTCCCAGTAGGGAGCCAAAGTTTCATCATGCTGAGATGTTGGATGGAAGTTTTAA 347
seq3 TGTATTACCCTGAGGACAAA----TATCATTCTGAGGGGCTG-------------- 329
seq5 TGTATTACCCTGAGGACAAA----TATCATTCTGAGGGGCTGCAGTTAA------- 334
seq1 TGTATTACCCTGAGGACAAA----TATCATTCTGAGGGGCTGCA------------ 334
** * * * **** ***** ***** * **
If you align two sequences (NCBI), an odd-numbered one and an even-numbered one, you'll see that they are the same sequence but read from opposite ends!
Length=347
Score = 483 bits (261), Expect = 4e-141
Identities = 288/300 (96%), Gaps = 5/300 (1%)
Strand=Plus/Minus <--
Query 2 GCT-CCTCACTGAGGACTCTGTTTAATCGCACAAATTGTCACAGGGCTATTCCTCGCAAT 60
Sbjct 307 GCTCCCT-ACTG-GGACTCTGCTTAATCGCACAAATTGTCACAGGCCTATTCCTCGCAAT 250
Query 61 ACACTACACGTCTGACATTACCACCGCCTTCTCATCCGGAGCCCACATCTGGCGAGACGT 120
Sbjct 249 ACACTACACGTCTGACATTACCACCGCCTTCTCATCCGTAGCCCACATCTGCCGAGACGT 190
Query 121 CAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGAGCCTCTTTCTTCTTTATCTG 180
Sbjct 189 CAATTACGGTTGACTAATCCGAAACCTCCACGCCAACGGAGCCTCTTTTTTCTTTATCTG 130
Query 181 CATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGAAACATG 240
Sbjct 129 CATTTATTTCCACATCGGACGAGGACTATACTACGGCTCCTACCTATATAAAGAAACATG 70
Query 241 AAACATCGGGGTAATTCTTCTCCTACTAGTTATAATAACCGCCTTTGTAGGT-TATGTAT 299
Sbjct 69 AAACATCGGGGTAATTCTTCTCTTACTAGTTATAA-AACCACCTTTGTAGGTATATGTAT 11
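Put simply, Strand=Plus/Minus says the second read matches the first one only after it is reverse-complemented, which is what you expect from a forward and a reverse read of the same PCR product. Here is a small Python sketch using a 24-base stretch that appears in both sequence 1 reads above (the function names are just for illustration). The 96% in the BLAST output is the same kind of calculation over the whole aligned region: 288 matching positions out of 300.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse the sequence and swap A<->T and C<->G (N stays N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_identity(a, b):
    """Percent of matching positions between two equal-length strings."""
    if not a or len(a) != len(b):
        raise ValueError("need two non-empty strings of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


forward_piece = "CATTTATTTCCACATCGGACGAGG"  # from the L14841 read
reverse_piece = "CCTCGTCCGATGTGGAAATAAATG"  # from the H15149 read

flipped = reverse_complement(reverse_piece)
print(flipped == forward_piece)
print(percent_identity(forward_piece, flipped))
print(percent_identity(forward_piece, reverse_piece))
```

Compared as typed, the two pieces look unrelated. After flipping the reverse read they are identical, which is why the odd- and even-numbered reads fall into two groups in the ClustalW alignment.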
molgen on Feb 12 2009, 09:11 AM said:
Length=347
Score = 483 bits (261), Expect = 4e-141
Identities = 288/300 (96%), Gaps = 5/300 (1%)
Strand=Plus/Minus <--
What does that mean? A simple explanation, please.