Textbook Canon: Expressing Any Genes in Bacteria - (Aug/09/2012 )
I am having problems understanding this: if you want to express a certain gene, then why use a low copy number plasmid?
(I can understand this for proteins that might be toxic for cells... but in general, isn't it better to have a high copy number and thus more DNA and more protein?)
Although perhaps we are discussing something different, as Julio-Claudian also noticed at the end of his post.
Also the following part:
In the K strain your ligated plasmid would be nicely transformed since there are minimal DNases and recombinases to affect it. These strains grow slowly but have great transformation efficiency. They are also very easy to make competent using the Hanahan method, which was actually developed for K strains. These strains are used mainly for plasmid production for mini/midi prepping.
In the B strain, your protein will be nicely expressed and protected from the proteases, so it can accumulate up to the harvest time. There are minimal proteases to chop your protein up. These strains grow faster but have terrible transformation efficiency.
You do mean that in a cloning vector the "inserted" plasmid DNA is more or less "left alone" and not degraded/inserted via (homologous) recombinase, while in an expression vector this plasmid DNA has more chance of being degraded/inserted into the genome, right?
So it's safe to conclude that in a cloning vector the main characteristic is that the plasmid DNA is kept stable and not altered.
And that in an expression vector it's the protein processing that is kept stable => no breakdown of proteins (or less protein breakdown).
Cloning strains, such as DH5a, Top10, etc., have important knockouts in specific genes. The recA gene knockout eliminates much of the accidental recombination of the strain. The endA knockout eliminates an important endonuclease, which can rapidly trash DNA during and after cell lysis. The knockout of the hsdR genes removes restriction enzymes which can cut transformed DNA before it can be methylated. Typically, few of these genes are knocked out in the expression strains, such as BL21. There, important knockouts include the lon protease, which can trash proteins during cell lysis.
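To make the strain comparison above easier to scan, here is a minimal Python sketch that tabulates only the knockouts named in this post. The dictionary names and the `protects` helper are made up for illustration; real genotypes are longer and differ between suppliers, so check the supplier's genotype sheet rather than relying on this table.

```python
# Knockouts exactly as named in the post above. Real genotypes are longer
# and vary between suppliers, so this table is illustrative only.
STRAIN_KNOCKOUTS = {
    "DH5a": {"recA", "endA", "hsdR"},   # cloning strain
    "Top10": {"recA", "endA", "hsdR"},  # cloning strain
    "BL21": {"lon"},                    # expression strain
}

# Which molecule each knockout helps to keep intact, per the post.
DNA_PROTECTING = {"recA", "endA", "hsdR"}
PROTEIN_PROTECTING = {"lon"}


def protects(strain):
    """Return a list saying whether the strain's knockouts protect DNA, protein, or both."""
    knockouts = STRAIN_KNOCKOUTS[strain]
    result = []
    if knockouts & DNA_PROTECTING:
        result.append("DNA")
    if knockouts & PROTEIN_PROTECTING:
        result.append("protein")
    return result


for name in STRAIN_KNOCKOUTS:
    print(name, protects(name))
```

The point of the lookup is simply that the cloning strains are tuned to keep the plasmid intact, while the expression strain is tuned to keep the protein intact.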
Wow, the more specific I try to be, the more questions you guys raise. Sorry for taking so long to come back to you.
I will start with Julio-Claudian:
I am personally not the most careful person with nomenclature. I state this at the beginning to make you aware that I might be off with some things. By ORF I understand the sequence from the start codon to the stop codon. Some people include the stop codon in the definition of the ORF, others do not. Hence, ORFs code for proteins.
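The ORF definition above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: it scans the forward strand only, takes the first ATG that is followed by an in-frame stop codon, and the function name `find_orf` and the `include_stop` switch are hypothetical names chosen here to reflect the two conventions mentioned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(dna, include_stop=True):
    """Return the first ORF (ATG up to an in-frame stop codon) in dna, or None.

    Only the forward strand is scanned. include_stop selects whether the
    stop codon counts as part of the ORF.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                end = i + 3 if include_stop else i
                return dna[start:end]
        # No in-frame stop found for this ATG; try the next one.
        start = dna.find("ATG", start + 1)
    return None


print(find_orf("ccATGAAATTTTAAgg"))                      # ATGAAATTTTAA
print(find_orf("ccATGAAATTTTAAgg", include_stop=False))  # ATGAAATTT
```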
For the question related to is ORF=gene, I will give you a riddle (actually is the fact that made me also think about this a few years back): C. elegans has ~20000 different proteins but ~36000 genes. Where does the difference come from? You (all of you except phage434) should attempt an answer. You will receive the answer after at least one attempt is done. I do this because if I give you all the answers, it goes in through one ear, and out through the other. Thinking it yourself will make you never forget the answer.
However, most of us refer to genes as those DNAs coding for proteins.
If you figure out the answer to the above riddle, you will realize that you do not always need an ORF downstream of the promoter.
Hint: promoter/terminator are needed for transcription; rbs, start/stop codons are needed for translation. Think about the product of each process.
Reading your comment, I have to make clear that you need some space between promoter, rbs, ORF and terminator. For some of them you even need specific distances, e.g. the rbs has to be 6-8 bases upstream of the start codon. So, it is crucial to have the optimum distances in order to have your protein expressed. Thankfully, the people who sell those expression vectors have it all figured out for us, so we can clone directly into the MCS and be sure that our protein will be expressed. Usually we don't bother ourselves with this type of question: we make sure that we have only the ORF from ATG to TAA (no additional promoters) and we insert this in the MCS. BTW: sometimes when you clone in commercial vectors using the MCS you get some additional amino acids at the termini, which is not the end of the world. It took me a while until I gave up trying to clone my inserts in such a way that I do not have this problem (it appears especially when you try to clone in vectors that add a tag to your protein). But let's keep on discussing for the sake of understanding what's happening in our experiments (don't think that I understand everything happening on the molecular level; I am still learning myself, and I can only help with what I know).
Cloning vector vs expression vector: yes, you are right; the difference is in having the promoter, rbs and terminator. I never used a cloning vector. The synthetic gene companies send the genes inserted in a cloning vector. Also, some people working on generating DNA libraries use cloning vectors because they are easier to handle (they are smaller, easier to transform, and do not express the proteins from the genes, which makes the bacteria happier).
With the origin of replication: I do not know why you use low copy number plasmids for expression. I have just noticed that all my expression vectors are low copy number plasmids. The reason that was given to me is that having too many copies per cell would lead to overproduction of the protein, which is toxic to the cell. However, having strong promoters such as T7 has the same result. But maybe it is better not to change too many variables at once. For example, when I test which plasmid is better for my protein, I check different promoter strengths (among other things). If I also had replication origins of different strengths, I would change 2 variables at once between experiments and I wouldn't know which one caused the difference.
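As a rough illustration of the spacing rule mentioned above (rbs 6-8 bases upstream of the start codon), here is a small Python sketch that measures the gap in a sequence. It assumes the rbs is the consensus Shine-Dalgarno motif AGGAGG and that the spacer itself contains no ATG; real vectors use variations of that motif, and the function names are hypothetical.

```python
def rbs_spacing(seq, rbs="AGGAGG", start_codon="ATG"):
    """Count the bases between the end of the rbs and the next start codon.

    Returns None if either the rbs motif or a downstream start codon is missing.
    The default motif is the consensus Shine-Dalgarno sequence (an assumption;
    the rbs in a given vector may differ).
    """
    seq = seq.upper()
    rbs_pos = seq.find(rbs)
    if rbs_pos == -1:
        return None
    rbs_end = rbs_pos + len(rbs)
    start_pos = seq.find(start_codon, rbs_end)
    if start_pos == -1:
        return None
    return start_pos - rbs_end


def spacing_ok(seq, low=6, high=8):
    """True if the rbs-to-start-codon gap falls in the 6-8 base window from the post."""
    spacing = rbs_spacing(seq)
    return spacing is not None and low <= spacing <= high


print(rbs_spacing("ttAGGAGGaacagctATGaaa"))  # 7
print(spacing_ok("ttAGGAGGaacagctATGaaa"))   # True
print(spacing_ok("AGGAGGATGaaa"))            # False, no spacer at all
```

In practice you rarely need to check this yourself for a commercial expression vector, since the rbs and its distance to the cloning site are already set by the vendor.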
@ lyok:
You seem to confuse strains and vectors: in the cloning vector, as Julio-Claudian observed, there is no promoter for your gene. Take care: strains are the living things with all the proteins/enzymes, DNA, RNA and a lipid membrane around them, while the vectors are the circular DNA. I know what you mean, but others might miss the message if we do not call things what they are.
Except for calling bacteria vectors, you got it right.
@phage434
I have a question regarding what you said:
phage434 on Sat Aug 11 18:59:25 2012 said:
The endA knockout eliminates an important endonuclease, which can rapidly trash DNA during and after cell lysis. ... expression strains, such as BL21. There, important knockouts include the lon protease, which can trash proteins during cell lysis.
For both of them you say that the degradation of DNA/protein can happen during cell lysis. What keeps them from degrading it in the cell? I always thought that the only one that acts upon cell lysis is OmpT, since it is the outer membrane protease, while lon is found in the cytosol. Moreover, I thought that the lon protease degrades aggregated proteins in the cytosol, among other important functions. This is why I assumed that if you have too many proteins (as in the case of overexpression), lon might degrade them.
Andreea
ascacioc on Sun Aug 12 23:33:55 2012 said:
For the question related to is ORF=gene, I will give you a riddle (actually is the fact that made me also think about this a few years back): C. elegans has ~20000 different proteins but ~36000 genes. Where does the difference come from?
Hint: promoter/terminator are needed for transcription; rbs, start/stop codons are needed for translation. Think about the product of each process.
Love riddles and I'll take a stab at this (scoffers, hold your peace):
(1) This could have everything to do with the nature of eukaryotic genes, where the RNA transcripts undergo a splicing step before translation. So the presence of non-coding genes/DNA would be sort of "filtered" in the splicing step before those encoding a protein would finally be translated.
(2) Using the hint, anything that comes under the control (within the confines) of the promoter/terminator pair will be transcribed. But the rbs might not be present at all. Taking this a step further, the absence of the rbs and start and stop codons could be due to a 'frame-shifted translation' (ok, I have no idea what that's called. And we arrived at the two-promoter situation).
Getting a little help, the difference in the number of genes and proteins is caused by the presence of "transposons, pseudogenes, and other artifacts".
This article in EMBO reports states that
and does seem to help my explanation in (2) that everything gets transcribed but not everything gets translated. I suspect, whilst these are facts, you're looking for something simpler with regards to promoter, terminator, rbs, and start/stop codons with the processes involved.
Yes, I am looking for something simpler. Occam's razor: the right explanation is the simplest one. Let's stay with prokaryotes. They are simpler :) You have your answer in the article from EMBO reports. Hint: not introns.
Other hints: which types of RNA do you know? Which type of RNA is translated to proteins?
Andreea
And to not wander off from bacteria, I will revise my sentence for E. coli K-12: 4,377 genes, 4,290 proteins.
Interesting.
*looks left and right* I think I'll have another go at it.
If all DNA/genes were destined for proteins, then there wouldn't be any tRNAs and rRNAs left. Or RNA I & II, and miRNAs for regulation.
That should account for the difference in the number of genes and proteins.
I sure hope this is the 'simpler' answer.
Yup :) You got it right. Not all genes make mRNA --> proteins; some genes make tRNAs, rRNAs, RNAis, piwiRNAs etc. Actually, C. elegans is the organism in which RNAi was discovered.
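For the E. coli K-12 numbers quoted earlier in the thread, the arithmetic behind this answer is short enough to write out. This is just a sketch that takes the two figures from the thread at face value and treats every protein as the product of exactly one gene; the variable names are made up for illustration.

```python
# Figures as quoted in the thread for E. coli K-12 (not independently checked).
genes_total = 4377
protein_coding = 4290  # assumes one protein per protein-coding gene

# Genes left over are the ones whose final product is an RNA (tRNA, rRNA, ...).
non_protein_genes = genes_total - protein_coding
fraction = non_protein_genes / genes_total

print(non_protein_genes)  # 87
print(f"{fraction:.1%} of genes do not end up as a protein")
```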
Why I did not accept the introns answer: if you have a gene with introns, it leads to a spliced mRNA (or several) and to a protein (or several splice variants). In simple terms: 5-90% of a gene ends up in a protein; however, when we count this gene towards the total number of genes, we do not say 0.5 gene + rest; we still say this gene codes for a protein. In EMBO reports they state, in other words, that 5% of the DNA in genes leads to proteins, not that 5% of the number of genes lead to proteins. You know what I mean?
Andreea