Potential promoter sequence analysis - strange 5' sequence - Questions that arose in my investigation of a promoter sequence (Jan/08/2009)

My question is about cloning a human promoter and about transcription in general.

Using the dbTSS database, PromoSer and the ENSEMBL browser, I have located the 5' upstream sequence of my gene of interest. Most of the literature and databases treat the region from 1000 bp upstream to 100 bp downstream of the TSS as the promoter, so I copied this sequence. It contains no TATA box, but Inr and DPE motifs are present. To check whether my sequence is a real promoter, I BLASTed it against EPD (Eukaryotic Promoter Database) to look for significant similarity; none was found. I then BLASTed about 4000 bp of 5' upstream sequence and found a highly conserved 289 bp region about 2200 bp upstream of the TSS. It has no TATA box but two BRE motifs. To find out which of these sequences is the promoter, I checked both for TF binding sites with MatInspector. The sequence near the TSS has more predicted TF binding sites, but the one 2200 bp upstream has some as well. Both sequences are GC rich, but the one near the TSS looks more like a typical CpG island. There are long ttttttttt repeats upstream of the sequence that aligns well with EPD (the one 2200 bp upstream of the TSS).
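
The motif check described above can be roughly reproduced with a few regular expressions. This is only a minimal sketch: the consensus strings below (TATAWAAR, YYANWYY, RGWYV, SSRCGCC, RTDKKKK) are commonly cited forms and are assumed here, since sources differ slightly, and the names `CORE_MOTIFS` and `scan_core_motifs` are made up for illustration. It scans only the strand it is given.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Assumed consensus sequences for core promoter elements (illustrative only).
CORE_MOTIFS = {
    "TATA": "TATAWAAR",
    "Inr": "YYANWYY",
    "DPE": "RGWYV",
    "BREu": "SSRCGCC",
    "BREd": "RTDKKKK",
}


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def scan_core_motifs(seq, motifs=CORE_MOTIFS):
    """Return {motif name: [0-based start positions]} for the given strand."""
    seq = seq.upper()
    hits = {}
    for name, consensus in motifs.items():
        # A lookahead is used so that overlapping matches are all reported.
        pattern = re.compile("(?=" + iupac_to_regex(consensus) + ")")
        hits[name] = [m.start() for m in pattern.finditer(seq)]
    return hits
```

Short degenerate motifs like these match by chance quite often, so a hit probably means little unless it sits at a plausible distance from the TSS.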

The questions I would like to answer:

1. Which of the sequences is the promoter?
2. Why does the sequence near the TSS not align with sequences from EPD, even though MatInspector shows many possible TF binding sites and Inr and DPE motifs are present?
3. What is the region 2200 bp upstream of the TSS? Could it be a regulatory sequence this far away, or a promoter of another unknown gene or RNA? What exactly does the EPD database contain: only core promoters, or also enhancers? Could it be an enhancer?
4. Can a BRE motif exist without a TATA box?
5. What could the tttttttttt repeats upstream of the "far away" promoter be a hallmark of?
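
To describe the T repeats from question 5 more precisely (where they are and how long), something like the following could be used. It is a minimal sketch; the function name `find_homopolymer_runs` and the default minimum length of 8 are arbitrary choices for illustration, not an established threshold.

```python
import re


def find_homopolymer_runs(seq, base="T", min_len=8):
    """Return (0-based start, length) for each run of `base` of at least `min_len`."""
    pattern = re.compile(re.escape(base.upper()) + "{" + str(min_len) + ",}")
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]
```

For example, `find_homopolymer_runs(upstream_seq)` would list every stretch of eight or more T's in a hypothetical `upstream_seq` string, which makes it easier to report their positions relative to the conserved region.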

I would be glad for any help. Thanks in advance!

-DzDz-

I would still go for the one nearer the TSS as the promoter sequence, as that is the traditional understanding. You may want to make sure that:

1. Your TSS is the real TSS.

2. There is no alternatively spliced product whose first exon starts just downstream of the promoter with the TATA sequence. You can check this by bioinformatic analysis of your gene structure, using a genomic sequence that encompasses the distal promoter. The alternatively spliced transcript may even be described in one of the many databases.

3. I have no idea about the relationship between the BRE motif and the TATA box.

4. Many promoters are TATA-less; lacking a TATA box does not exclude a sequence from being a promoter.

5. The distal promoter may be any of the following:

-An enhancer region.
-A promoter region for another gene: one coding for a regulatory RNA, another smaller gene in the same direction, or a gene of any size transcribed in the reverse direction.
-A promoter for an alternatively transcribed sequence of your gene.
-Junk that happens to have TF binding sequences, with no relevance to the regulation of anything.

-cellcounter-

Thanks for the reply.

Do you know if there is any method to test the TSS computationally? The TSS that I guess is the real one lies a little upstream of the TSS from the dbTSS database, which is at the beginning of the known 5' UTR. To the best of my knowledge there are no alternative splice variants of my protein.

What confuses me most is that the "promoter" near the TSS, which has many common promoter motifs and a CpG island, does not align with entries from the Eukaryotic Promoter Database, while the ~220 bp region 2200 bp upstream of the TSS matches almost perfectly.
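
To put a number on "CpG island" for both candidate regions, the GC fraction and the observed/expected CpG ratio can be computed directly. The sketch below assumes the widely used Gardiner-Garden and Frommer style thresholds (length of at least 200 bp, GC of at least 50%, observed/expected CpG of at least 0.6) and applies them to the whole sequence rather than in a sliding window; the function names are made up for illustration.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC and obs/exp thresholds to the whole sequence."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp
```

Running `cpg_stats` on the region near the TSS and on the region 2200 bp upstream would give two pairs of numbers that can be compared side by side, instead of relying on how GC rich each one looks.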

About the TSS: the Genomatix ElDorado server uses the annotation "TSR" (transcription start region). How should I understand this? Some recent publications also use this term.

-DzDz-