Intronic & intergenic EST? - (May/22/2007)
I have about 90 differentially regulated EST clones of unknown identity. Many are singletons, and for some of them I could extend the length of the sequence by clustering against other ESTs. When I BLAST them against the Ensembl or UCSC Genome Browser, most of them fall within the intronic regions of annotated genes, while the rest align to intergenic regions. Additionally, some of those in intronic regions are on the reverse strand relative to the annotated gene. I'd appreciate any comments on the following:
1. Can ESTs contain intronic or intergenic sequences? If yes, are they possibly alternative variants of a gene, and how can I test them initially using in silico methods? If not, are these intronic/intergenic ESTs just contamination or poor-quality sequences? But then how do I explain the differential regulation?
2. If my EST sequence aligns to the reverse strand of an annotated gene, is that EST still part of that gene, or is it part of a different gene on the reverse strand?
Thank you very much
1. Intronic or intergenic sequence should not be part of an EST; if it is a true EST, it should represent an exon. As you suggest, it could be a previously undefined exon, but that should be borne out by looking at the ESTs in UCSC or Ensembl (i.e., I would suspect error or contamination if there are no other ESTs, spliced ESTs or mRNAs that show exons in the region). You can also look at conservation: highly conserved sequences are more likely to be exons, so if the sequence is highly conserved I might assume it is an exon even without the other supporting bioinformatic evidence.
2. That depends on the direction you sequenced. You may or may not know whether you have sequenced the positive or negative strand of your EST clone. If you have the positive-strand sequence and find it in the negative orientation, then it is part of the gene cluster on the reverse strand; positive strand + positive sequence = gene cluster on the forward strand. If the direction is unknown, your best bet is to assume it may be either positive or negative sequence, in which case it could belong to a gene on either strand. You then pick the most likely strand based on surrounding genes, other ESTs that have been spliced, etc. I think you should search with both the forward and the reverse-complement sequence to make sure that you get the same hit.
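To make the strand logic in point 2 concrete, here is a minimal Python sketch. The function names and the toy sequence are hypothetical, chosen only for illustration. It assumes you record whether a read is known to be mRNA-sense (for example from directional cloning) and which genome strand the aligner reported; the antisense-read case is simply the mirror image of the rule described above.

```python
# Hypothetical helpers for illustration only; the sequence is a toy example.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def infer_gene_strand(read_is_sense, hit_strand):
    """Guess which genome strand the transcribed gene lies on.

    read_is_sense: True if the read is known to be mRNA-sense,
                   False if known to be antisense, None if unknown.
    hit_strand:    '+' or '-', the genome strand the read aligned to.
    Returns '+', '-', or None when the read orientation is unknown.
    """
    if hit_strand not in ("+", "-"):
        raise ValueError("hit_strand must be '+' or '-'")
    if read_is_sense is None:
        return None  # could belong to a gene on either strand
    if read_is_sense:
        return hit_strand  # sense read: gene is on the strand it aligns to
    return "-" if hit_strand == "+" else "+"  # antisense read: opposite strand


est = "ATGCCN"
queries = {"forward": est, "reverse_complement": reverse_complement(est)}
```

Both entries in `queries` can then be submitted to BLAST or BLAT; they should land on the same genomic location with opposite reported strands.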
HTH and good luck!
Beccaf22, thanks for your comment.
I have tried ESTScan and a few of these sequences seem to be coding. Assuming they may be novel exons, what would be the best way to check whether they can actually splice with any of the known downstream exons to produce a transcript?
You can check whether there are any spliced ESTs in the database that include that sequence, or you could design primers in the two exons in question and PCR across them from cDNA samples. Make sure there is no gDNA contamination, though.
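The database check for spliced ESTs can be roughed out in Python as below. This is only a sketch under assumptions: the coordinates are made up, intervals are half-open (start, end) on one chromosome, and each EST is represented by the genomic blocks of its alignment (such as those reported by a spliced aligner). The function names are hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def supporting_ests(candidate, known_exons, est_blocks):
    """Return names of spliced ESTs linking the candidate region to a known exon.

    candidate:   (start, end) of the putative novel exon
    known_exons: list of (start, end) for annotated exons of the gene
    est_blocks:  dict mapping EST name -> list of aligned genomic blocks
    """
    support = []
    for name, blocks in est_blocks.items():
        if len(blocks) < 2:
            continue  # unspliced alignment: cannot show a junction
        cand_idx = [i for i, blk in enumerate(blocks) if overlaps(blk, candidate)]
        exon_idx = [
            i for i, blk in enumerate(blocks)
            if any(overlaps(blk, exon) for exon in known_exons)
        ]
        # Require two different blocks: one in the candidate, one in a known exon.
        if any(i != j for i in cand_idx for j in exon_idx):
            support.append(name)
    return support


# Toy data, not real annotations.
candidate = (500, 600)
known_exons = [(100, 200), (900, 1000)]
est_blocks = {
    "est_a": [(520, 600), (900, 960)],   # spliced: candidate -> known exon
    "est_b": [(510, 590)],               # unspliced singleton
    "est_c": [(100, 200), (900, 1000)],  # skips the candidate region
}
hits = supporting_ests(candidate, known_exons, est_blocks)
```

An EST like `est_a` would be in silico support for the novel exon; an empty result would point towards the RT-PCR test across the putative junction.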
good luck..