How to select candidate TFs for future study from the prediction results? - (Jan/21/2007 )
Dear friends and experts, I am trying to find transcription factors that regulate the expression of my gene of interest. Below are my thoughts, what I have done so far, and my problems.
Advice and suggestions are very welcome. I will greatly appreciate your help and time.
Thank you very very much!
Basis:
High fat-diet increases X gene’s expression greatly (in rat).
Hypothesis:
Maybe one or more transcription factors (TFs) regulate X gene's expression.
What has been done:
1. Get X gene’s genomic sequence and mRNA sequence from Ensembl or NCBI GenBank;
2. Get X gene’s promoter sequence from NCBI GenBank;
3. Analyze the sequence; find the exon positions and the transcription start site;
4. Get X gene’s 5’ flanking sequence (the 1.5 kb upstream of the transcription start site);
5. Get the homologous gene sequence in mouse;
6. Use ‘TFSEARCH’, ‘PROMOTER 2.0’, ‘FirstEF’, ‘rVISTA’ and other online programs to predict the possible transcription factor binding sites (TFBSs) and transcription factors (TFs);
7. Compare the results obtained from the different programs.
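Step 7 can be sketched in a few lines of Python, assuming each program's output has already been reduced by hand to a plain list of TF names. The names below are placeholders, not real predictions, and the `normalize` helper is a hypothetical simplification: real tools often label the same factor with different matrix names, so the matching may need manual curation.

```python
from collections import Counter


def normalize(name: str) -> str:
    """Crude normalization so that 'TF-A', 'tf_a' and ' TF-A ' compare equal."""
    return name.strip().upper().replace("-", "").replace("_", "")


def consensus_tfs(predictions: dict[str, list[str]], min_tools: int = 2) -> dict[str, int]:
    """Return TFs predicted by at least `min_tools` programs, with their support count."""
    counts = Counter()
    for names in predictions.values():
        # Count each TF once per program, however many sites it has.
        counts.update({normalize(n) for n in names})
    return {tf: c for tf, c in counts.items() if c >= min_tools}


# Placeholder names only (assumption): substitute the real program outputs.
predictions = {
    "TFSEARCH": ["TF-A", "TF_B", "TF-C", "TF-A"],
    "rVISTA": ["tf-a", "TF-C", "TF-D"],
}
shared = consensus_tfs(predictions)
print(sorted(shared))
```

Counting support per program, rather than per site, keeps a factor with many low-scoring hits in one tool from looking better supported than it is.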
Results:
1. The results from ‘rVISTA’:
1). 58 conserved and 58 aligned transcription factor binding sites (TFBSs) were identified;
2). 3 conserved TFs were predicted by both ‘TFSEARCH’ and ‘rVISTA’, and 11 conserved TFs were obtained from ‘rVISTA’.
Troubles:
1). How can the reliability of ‘TFSEARCH’ and ‘rVISTA’ be evaluated? Are there any references on their accuracy?
2). How should I select candidate TFs for future study from these results?
3). I want to clearly understand the workflow for this kind of study. Can any experts in this field recommend several good papers?
Still nobody has answered.
Is it a very hard question?
Yes, it is a hard question...
In my last project I had to do exactly the same. After doing the bioinformatic search for putative TF binding sites I got a never-ending list of TFs, and I can say that depending on which program you use you will get different results; bioinformatics is not as well established as people say. You have two options.
1. You can try to focus on TFs that sound reasonable by checking their expression and regulation in your tissue. If you then end up with just a few, order the oligonucleotides corresponding to your promoter and do an EMSA.
2. This is what I did: forget about the TF search results and perform DNase I footprinting with your regulatory sequences. (It takes some time to establish and is a really tricky method.) When you get a footprint you will have the sequence where the protein interaction occurs. Now check that sequence against your TF search results to see which factors are predicted to bind there. Confirm by EMSA. The best thing is to do a supershift to prove specificity.
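Checking a footprint against the TF search results comes down to an interval-overlap lookup. Here is a minimal sketch, assuming the predicted sites have been collected as (TF, start, end) with coordinates relative to the transcription start site. The `Site` type, the TF names and all coordinates are made up for illustration.

```python
from typing import NamedTuple


class Site(NamedTuple):
    tf: str
    start: int  # relative to the TSS, negative = upstream
    end: int    # inclusive


def sites_in_footprint(sites, fp_start, fp_end, min_overlap=1):
    """Return (TF, overlap in bp) for predicted sites overlapping the protected region."""
    hits = []
    for s in sites:
        overlap = min(s.end, fp_end) - max(s.start, fp_start) + 1
        if overlap >= min_overlap:
            hits.append((s.tf, overlap))
    # Largest overlap first.
    return sorted(hits, key=lambda h: -h[1])


# Hypothetical predictions and footprint (assumption, not real data).
predicted = [
    Site("TF-A", -320, -311),
    Site("TF-B", -305, -296),
    Site("TF-C", -120, -111),
]
hits = sites_in_footprint(predicted, fp_start=-315, fp_end=-300)
print(hits)
```

Raising `min_overlap` would restrict the hits to sites that lie mostly inside the protected region. Either way the output is only a list of candidates to confirm by EMSA.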
I am not sure whether what I suggest is right.
Why not try making some deletion constructs of your promoter region? I presume you have a reporter construct containing the promoter of your target gene. If you chop it down progressively and do the luc assay, it may narrow down the TFs to focus on. It is quicker.
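The deletion-construct idea can be planned and read out roughly as follows. This sketch assumes a 1.5 kb upstream fragment, a fixed 250 bp step, and constructs that all end at +50. These values, and the activity numbers in the example, are invented for illustration and are not measurements.

```python
def deletion_series(upstream_len=1500, step=250, downstream=50):
    """Return (5' end, 3' end) of each 5' deletion construct, relative to the TSS."""
    return [(start, downstream) for start in range(-upstream_len, 0, step)]


def candidate_region(activities):
    """Given {5' end: relative luciferase activity}, return (longer, shorter, drop)
    for the pair of neighbouring constructs with the largest loss of activity."""
    ends = sorted(activities)  # most upstream (longest construct) first
    best = None
    for longer, shorter in zip(ends, ends[1:]):
        drop = activities[longer] - activities[shorter]
        if best is None or drop > best[2]:
            best = (longer, shorter, drop)
    return best


constructs = deletion_series()
print(constructs)

# Invented activities (percent of the full-length construct), illustration only.
activities = {-1500: 100, -1250: 95, -1000: 90, -750: 30, -500: 28, -250: 25}
print(candidate_region(activities))
```

The interval between the two 5' ends returned is where an activating element would be expected, so the predicted TFBSs in that window become the first candidates to test.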
Thank you very much for the suggestions.