how to design a kozak sequence in front of the start codon - (Sep/26/2011 )
hi folks
I am currently trying out subcloning with a recombination system, and honestly this is my first experience with it, so I really need some help from you guys. Any advice will be much appreciated.
I am designing a primer with a Kozak sequence followed by the start codon ATG. I have read some literature saying that the best Kozak sequence is gccgccacc, but I rarely see a plasmid that actually has it; of course, other Kozak sequences may also work perfectly well. My questions are: are there any other rules to follow when designing the Kozak sequence in front of the ATG? If gccgccacc is the best Kozak, why do people choose other sequences instead? And do I need to put any linker between the Kozak and the ATG?
Please help me out on this issue, I would appreciate it very much.
thanks in advance
dan
So although Marilyn Kozak would argue otherwise, the "Kozak sequence" is not considered a universal truth. A lot of things affect the efficiency of translation initiation, and the sequence around the start codon is only one of them. The short answer to your question is that the Kozak sequence I have seen most widely used is "ccAccatgG", which incorporates the two most critical residues: "A" at the -3 position and "G" at the +4 position (counting the A of the ATG as +1, so +4 is the base immediately after the ATG). The "G" at +4 is something you may not have control over, depending on your natural protein sequence, since it is the first base of the second codon (but Gly or Ala is a very common second amino acid). I have seen a few people use the full sequence you've described, but not many.
You have to understand that a lot of the translation initiation work was done either by measuring in vitro translation efficiency with point mutations or by surveying natural gene start sequences and comparing protein levels. Neither approach isolates the start codon context very well, since things like 5'UTR length, 5'UTR sequence, RNA structure downstream of the start codon, and codon usage can all dramatically affect efficiency. Every gene is different, and chances are you will have limited control over a significant number of these factors. I have used the "CCACCATGG" sequence many times for many different proteins in the same expression system, and I have seen a WIDE range of protein expression levels, so I would argue that the protein sequence itself will help or hurt you all on its own.
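If it helps to sanity-check a primer design, here is a rough Python sketch of the two positions discussed above. It only looks at the -3 and +4 bases (with the A of the ATG counted as +1) and prepends a short upstream context to an ORF. The function names and the default "CCACC" upstream stretch are my own choices for illustration, not a standard tool, and passing both checks says nothing about the other factors that affect expression.

```python
def kozak_context(seq, atg_index):
    """Report the -3 and +4 bases around an ATG (the A of the ATG is +1).

    seq        -- DNA sequence (any case)
    atg_index  -- 0-based index of the A of the ATG within seq
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("need 3 bases upstream and 1 base downstream of the ATG")
    minus3 = seq[atg_index - 3]   # three bases upstream of the A
    plus4 = seq[atg_index + 3]    # the base immediately after the ATG
    return {"minus3_is_A": minus3 == "A", "plus4_is_G": plus4 == "G"}


def add_upstream_context(orf, upstream="CCACC"):
    """Prepend an upstream context to an ORF that already starts with ATG.

    The ORF itself is left unchanged, so the +4 base is whatever the
    second codon of the protein happens to start with.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must start with ATG")
    return upstream.upper() + orf


if __name__ == "__main__":
    # Hypothetical ORF fragment: ATG followed by GCT (Ala) as the second codon.
    design = add_upstream_context("ATGGCTAAA")
    print(design)
    print(kozak_context(design, design.index("ATG")))
```

Note that no linker is placed between the upstream context and the ATG here; the context bases sit directly against the start codon, as in the "ccAccatgG" sequence above.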
The other thing you have to think about is what system you are working in. Bacteria? Yeast? Insect cells? Mammalian cells? This does make a difference, since the preferred "Kozak sequence" is actually different depending on what cell type you're in. I would recommend you read this paper on start codon context to get a feel for what I'm talking about:
Cavener, D.R., and Ray, S.C., 1991. Eukaryotic start and stop translation sites. Nucleic Acids Res. 19, 3185‑3192
Best of Luck.