The Flow of Information from Stored to Active Form
The cell can be viewed as a unit that assembles resources from the environment into biochemically functional molecules and organizes these molecules in three-dimensional space in a way that allows cellular growth and replication. In order to carry out this organizational process, a cell must have a biosynthetic means to assemble resources into useful molecules, and it must contain the information required to produce the biosynthetic and structural machinery. DNA serves as the stored form of this information, whereas protein is its active form. Although there are thousands of different proteins in cells, they either serve a structural role or are enzymes that catalyze the biosynthetic reactions of a cell. Following the discovery of the structure of DNA in 1953 by James Watson and Francis Crick, scientists began to study the process by which the information stored in this molecule is converted into protein.
Proteins are linear, functional molecules composed of a unique sequence of amino acids. Twenty different amino acids are used as the protein building blocks. Although the information for the amino acid sequence of each protein is present in DNA, protein is not synthesized directly from this source. Instead, RNA serves as the intermediate form from which proteins are synthesized. RNA plays three roles during protein synthesis. Messenger RNA (mRNA) contains the information for the amino acid sequence of a protein. Transfer RNAs (tRNAs) are small RNA molecules that serve as adapters that decipher the coded information present within an mRNA and bring the appropriate amino acid to the polypeptide as it is being synthesized. Ribosomal RNAs (rRNAs) act as the engine that carries out most of the steps during protein synthesis. Together with a specific set of proteins, rRNAs form ribosomes that bind the mRNA, serve as the platform for tRNAs to decode an mRNA, and catalyze the formation of peptide bonds between amino acids. Each ribosome is composed of two subunits: a small (or 40s) and a large (or 60s) subunit, each of which has its own function. The “s” in 40s and 60s is an abbreviation for Svedberg units, which are a measure of how quickly a large molecule or complex molecular structure sediments (or sinks) to the bottom of a centrifuge tube while being centrifuged. The larger the number, the larger the molecule.
Like all RNA, mRNA is composed of just four types of nucleotides: adenine (A), guanine (G), cytosine (C), and uracil (U). Therefore, the information in an mRNA is contained in a linear sequence of nucleotides that is converted into a protein molecule composed of a linear sequence of amino acids. This process is referred to as “translation,” since it converts the “language” of nucleotides that make up an mRNA into the “language” of amino acids that make up a protein. This is achieved by a three-letter genetic code in which each amino acid in a protein is specified by a three-nucleotide sequence in the mRNA called a codon. Because there are four possible “letters,” there are sixty-four (4 × 4 × 4) possible three-letter “words.” As there are only twenty amino acids used to make proteins, most amino acids are encoded by several different codons. For example, there are six different codons (UCU, UCC, UCA, UCG, AGU, and AGC) that specify the amino acid serine, whereas there is only one codon (AUG) that specifies the amino acid methionine. The mRNA, therefore, is simply a linear array of codons (that is, three-nucleotide “words” that are “read” by tRNAs together with ribosomes). The region within an mRNA containing this sequence of codons is called the coding region.
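The codon arithmetic above can be checked with a short Python sketch. The names ALL_CODONS, PARTIAL_CODE, codons_for, and split_into_codons are illustrative choices, not standard identifiers, and the lookup table deliberately holds only the codon assignments named in the text rather than the full genetic code.

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # the four "letters" of an mRNA

# Every three-letter "word" that can be built from the four letters.
ALL_CODONS = ["".join(letters) for letters in product(NUCLEOTIDES, repeat=3)]

# Assumption: a partial table with only the assignments named in the text.
PARTIAL_CODE = {
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",
    "AGU": "Ser", "AGC": "Ser",
    "AUG": "Met",
}


def codons_for(amino_acid, code=PARTIAL_CODE):
    """Return the sorted list of codons that specify the given amino acid."""
    return sorted(codon for codon, aa in code.items() if aa == amino_acid)


def split_into_codons(coding_region):
    """Cut a coding region into consecutive three-nucleotide codons.

    Any trailing one or two nucleotides that do not fill a codon are dropped.
    """
    usable = len(coding_region) - len(coding_region) % 3
    return [coding_region[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    print(len(ALL_CODONS))                  # number of possible codons
    print(codons_for("Ser"))                # the six serine codons
    print(codons_for("Met"))                # the single methionine codon
    print(split_into_codons("AUGUCUAGC"))   # a coding region as codons
```

Generating the codons with a Cartesian product makes the count of sixty-four a consequence of the alphabet size rather than a number typed in by hand.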
Before translation can occur in eukaryotic cells, mRNAs undergo processing steps at both ends to add features that will be necessary for translation. (These processing steps do not occur in prokaryotic cells.) Nucleotides are structured such that they have two ends, a 5′ and a 3′ end, that are available to form chemical bonds with other nucleotides. Each nucleotide present in an mRNA has a 5′ to 3′ orientation that gives a directionality to the mRNA so that the RNA begins with a 5′ end and finishes with a 3′ end. The ribosome reads the coding region of an mRNA in a 5′ to 3′ direction. Following the synthesis of an mRNA from its DNA template, one guanine is added to the 5′ end of the mRNA in an inverted orientation; it is the only nucleotide in the entire mRNA present in a 3′ to 5′ orientation and is referred to as the cap. A long stretch of adenosines is added to the 3′ end of the mRNA to make what is called the poly-A tail.
Typically, mRNAs have a stretch of nucleotide sequence that lies between the cap and the coding region. This is referred to as the leader sequence and is not translated. Therefore, a signal is necessary to indicate where the coding region initiates. The codon AUG usually serves as this initiation codon; however, other AUG codons may be present in the coding region. Any one of three possible codons (UGA, UAG, or UAA) can serve as stop codons that signal the ribosome to terminate translation. Several accessory proteins assist ribosomes in binding mRNA and help carry out the required steps during translation.
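The layout described above (leader sequence, initiation codon, coding region, stop codon) can be sketched in Python. The function name find_coding_region and the example sequence are hypothetical, and the sketch makes a simplifying assumption: it takes the first AUG it meets as the initiation codon, which real ribosomes do not always do.

```python
STOP_CODONS = {"UGA", "UAG", "UAA"}  # the three stop codons named in the text


def find_coding_region(mrna):
    """Return (start, end) indices of the coding region, or None.

    Simplifying assumption: the first AUG in the sequence is treated as the
    initiation codon. The region runs from that AUG through the first stop
    codon found in the same reading frame (end index is exclusive).
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Step through the sequence one codon (three nucleotides) at a time.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None  # no in-frame stop codon was found


if __name__ == "__main__":
    # Hypothetical mRNA: 6-nucleotide leader + coding region + trailing As.
    mrna = "GGCACC" + "AUGGCUUCUUAA" + "AAAA"
    region = find_coding_region(mrna)
    print(region)
    if region is not None:
        start, end = region
        print(mrna[:start])     # leader sequence, not translated
        print(mrna[start:end])  # coding region including the stop codon
```

Reading in steps of three from the initiation codon is what keeps an out-of-frame UGA, UAG, or UAA from being mistaken for a stop signal.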
The Translation Process: Initiation
Translation occurs in three phases: initiation, elongation, and termination. The function of the 40s ribosomal subunit is to bind to an mRNA and locate the correct AUG as the initiation codon. It does this by binding close to the cap at the 5′ end of the mRNA and scanning the nucleotide sequence in its 5′ to 3′ direction in search of the initiation codon. Marilyn Kozak identified a certain nucleotide sequence surrounding the initiator AUG of eukaryotic mRNAs that indicates to the ribosome that this AUG is the initiation codon. She found that the presence of an A or G three nucleotides prior to the AUG and a G in the position immediately following the AUG were critical in identifying the correct AUG as the initiation codon. This is referred to as the “sequence context” of the initiation codon. Therefore, as the 40s ribosomal subunit scans the leader sequence of an mRNA in a 5′ to 3′ direction, it searches for the first AUG in this context and may bypass other AUGs not in this context.
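The scanning rule can be illustrated with a small Python sketch. The function names and the example leader are hypothetical, and the model is deliberately strict: it accepts an AUG only when both context features named above are present (A or G three nucleotides before the AUG, and G immediately after it), whereas real initiation is more graded.

```python
def in_sequence_context(mrna, i):
    """Return True if an AUG at index i sits in the described context.

    Checks for an A or G three nucleotides before the AUG and a G in the
    position immediately following it.
    """
    if mrna[i:i + 3] != "AUG":
        return False
    if i < 3 or i + 3 >= len(mrna):
        return False  # not enough flanking sequence to judge the context
    return mrna[i - 3] in "AG" and mrna[i + 3] == "G"


def scan_for_initiation_codon(mrna):
    """Scan 5' to 3' and return the index of the first AUG in context."""
    for i in range(len(mrna) - 2):
        if in_sequence_context(mrna, i):
            return i
    return None


if __name__ == "__main__":
    # Hypothetical mRNA with an early AUG in a poor context (index 3)
    # and a later AUG in a good context (index 12).
    mrna = "CCUAUGCCCACCAUGGCU"
    print(mrna.find("AUG"))                 # the first AUG of any kind
    print(scan_for_initiation_codon(mrna))  # the first AUG in context
```

In the example, the first AUG is bypassed because a C, not an A or G, lies three nucleotides upstream, so scanning continues to the second AUG.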
Nahum Sonenberg demonstrated that the scanning process by the 40s subunit can be impeded by stem-loop structures in the leader sequence. These form from base pairing between complementary nucleotides present in the leader sequence. Two nucleotides are said to be complementary when they join together by hydrogen bonds. For instance, the nucleotide (or base) A is complementary to U, and these two can form what is called a “base pair.” Likewise, the nucleotides C and G are complementary. Several accessory proteins, called eukaryotic initiation factors (eIFs), aid the binding and scanning of 40s subunits. The first of these, eIF4F, is composed of three subunits called eIF4E, eIF4A, and eIF4G. The protein eIF4E is the subunit responsible for recognizing and binding to the cap of the mRNA. The eIF4A subunit of eIF4F, together with another factor called eIF4B, removes stem-loop structures from the leader sequence by disrupting the base pairing between nucleotides in the stem loop. The protein eIF4G is the large subunit of eIF4F, and it serves to interact with several other proteins, one of which is eIF3. It is this latter initiation factor that the 40s subunit first associates with during its initial binding to an mRNA.
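The base-pairing rule behind stem-loop formation can be sketched in Python. The function names and sequences are hypothetical. Two assumptions go beyond the text: the two strands of a stem pair in opposite (antiparallel) orientation, and only the A-U and G-C pairs named above are counted, so other pairings that occur in real RNA are ignored.

```python
# Complementary partners as described in the text: A with U, C with G.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_complementary(a, b):
    """Return True if nucleotides a and b can form a base pair."""
    return PAIRS.get(a) == b


def can_form_stem(left, right):
    """Return True if two segments of one RNA can pair to form a stem.

    Both segments are written 5' to 3'. Assumption: pairing is
    antiparallel, so the first nucleotide of `left` pairs with the last
    nucleotide of `right`, and so on inward.
    """
    if len(left) != len(right):
        return False
    return all(is_complementary(x, y) for x, y in zip(left, reversed(right)))


if __name__ == "__main__":
    # Hypothetical leader fragment: stem arm, unpaired loop, stem arm.
    left, loop, right = "GGAC", "UUUU", "GUCC"
    print(left + loop + right)
    print(can_form_stem(left, right))  # the two arms can pair
    print(can_form_stem(left, left))   # identical arms cannot pair here
```

A scanning 40s subunit would meet such a paired region as an obstacle, which is why factors that disrupt the base pairs are needed before scanning can proceed.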
Through the combined action of eIF4G and eIF3, the 40s subunit is bound to the mRNA, and through the action of eIF4A and eIF4B, the mRNA is prepared for 40s subunit scanning. As the cellular concentration of eIF4E is very low, mRNAs must compete for this protein. Those that do not compete well for eIF4E will not be translated efficiently. This represents one means by which a cell can regulate protein synthesis. One class of mRNA that competes poorly for eIF4E encodes growth-factor proteins. Growth factors are required in small amounts to stimulate cellular growth. Sonenberg has shown that the overproduction of eIF4E in animal cells leads to a reduction in the competition for this protein, and mRNAs such as growth-factor mRNAs that were previously poorly translated when the concentration of eIF4E was low are now translated at a higher rate when eIF4E is abundant. This in turn results in the overproduction of growth factors, which leads to uncontrolled growth, a characteristic typical of cancer cells.
A protein that specifically binds to the poly-A tail at the 3′ end of an mRNA is called the poly-A-binding protein (PABP). This protein was discovered in the 1970s, and its only function was thought to be protecting the mRNA from attack at its 3′ end by enzymes that degrade RNA. Daniel Gallie demonstrated another function for PABP by showing that the PABP-poly-A-tail complex was required for the function of the eIF4F-cap complex during translation initiation. The idea that a protein located at the 3′ end of an mRNA should participate in events occurring at the opposite end of an mRNA seemed strange initially. However, RNA is quite flexible and is rarely present in a straight, linear form in the cellular environment. Consequently, the poly-A tail can easily approach the cap at the 5′ end. Gallie showed that PABP interacts with eIF4G and eIF4B, two initiation factors that are closely associated with the cap, through protein-to-protein contacts. The consequence of this interaction is that the 3′ end of an mRNA is held in close physical proximity to its cap. The interaction between these proteins stabilizes their binding to the mRNA, which in turn promotes protein synthesis. Therefore, mRNAs can be thought of as adopting a circular form during translation that looks similar to a snake biting its own tail. This idea is now widely accepted by scientists.
One additional factor, called eIF2, is needed to bring the first tRNA to the 40s subunit. Along with the initiator tRNA (which decodes the AUG codon specifying the amino acid methionine), eIF2 aids the 40s subunit in identifying the AUG initiation codon. Once the 40s subunit has located the initiation codon, the 60s ribosomal subunit joins the 40s subunit to form the intact 80s ribosome. (Svedberg units are not additive; therefore, a 40s and a 60s subunit joined together do not make a 100s unit.) This marks the end of the initiation phase of translation.
The Translation Process: Elongation and Termination
During the elongation phase, tRNAs bind to the 80s ribosome as it passes over the codons of the mRNA, and the amino acids attached to the tRNAs are transferred to the growing polypeptide. Binding of the tRNAs to the ribosome is assisted by an accessory protein called eukaryotic elongation factor 1 (eEF1). A codon is decoded by the appropriate tRNA through base pairing between the three nucleotides that make up the codon in the mRNA and three complementary nucleotides within a specific region (called the anticodon) within the tRNA. The tRNA binding sites in the 80s ribosome are located in the 60s subunit. The ribosome moves over the coding region one codon at a time, or in steps of three nucleotides, in a process referred to as “translocation.” When the ribosome moves to the next codon to be decoded, the tRNA containing the appropriate anticodon will bind tightly in the open site in the 60s subunit (the A site). The tRNA that bound to the previous codon is present in a second site in the 60s subunit (the P site). Once a new tRNA has bound to the A site, the ribosomal RNA itself catalyzes the formation of a peptide bond between the growing polypeptide and the new amino acid. This results in the transfer of the polypeptide attached to the tRNA present in the P site to the amino acid on the tRNA present in the A site. A second elongation factor, eEF2, catalyzes the movement of the ribosome to the next codon to be decoded. This process is repeated one codon at a time until a stop codon is reached.
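The elongation cycle can be summarized as a loop in Python. This is a toy model with hypothetical names (anticodon_for, TRNA_POOL, elongate), not a description of any real software. It assumes strict A-U and G-C pairing between codon and anticodon, writes the anticodon 5′ to 3′ (so it is the reverse complement of the codon), and stocks the tRNA pool only with amino acids whose codons are named earlier in this article.

```python
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}
STOP_CODONS = {"UGA", "UAG", "UAA"}


def anticodon_for(codon):
    """Return the anticodon (written 5' to 3') that pairs with a codon."""
    return "".join(PAIRS[n] for n in reversed(codon))


# Assumption: a toy tRNA pool keyed by anticodon, covering three codons only.
TRNA_POOL = {
    anticodon_for("AUG"): "Met",
    anticodon_for("UCU"): "Ser",
    anticodon_for("AGC"): "Ser",
}


def elongate(coding_region, trna_pool=TRNA_POOL):
    """Decode a coding region codon by codon until a stop codon is reached."""
    polypeptide = []
    # Translocation: advance in steps of three nucleotides.
    for i in range(0, len(coding_region) - 2, 3):
        codon = coding_region[i:i + 3]
        if codon in STOP_CODONS:
            break  # no tRNA decodes a stop codon; elongation ends here
        # A matching tRNA binds in the A site by anticodon-codon pairing.
        amino_acid = trna_pool[anticodon_for(codon)]
        # Peptide bond formation: the chain gains the new amino acid.
        polypeptide.append(amino_acid)
    return polypeptide


if __name__ == "__main__":
    print(anticodon_for("AUG"))
    print(elongate("AUGUCUAGCUAA"))
```

Keying the pool by anticodon rather than by codon mirrors the adapter role of tRNA: the ribosome never reads an amino acid directly from the mRNA, only from whichever tRNA pairs with the codon.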
The termination phase of translation begins when the ribosome reaches one of the three termination or stop codons. These are also referred to as “nonsense” codons, as the cell does not produce any tRNAs that can decode them. Accessory factors, called release factors, are also required to assist this stage of translation. They bind to the empty A site in which the stop codon is present, and this triggers the cleavage of the bond between the completed protein and the last tRNA in the P site, thereby releasing the protein. The ribosome then dissociates into its 40s and 60s subunits, the latter of which diffuses away from the mRNA. The close physical proximity of the cap and poly-A tail of an mRNA maintained by the interaction between PABP and the initiation factors (eIF4G and eIF4B) is thought to assist the recycling of the 40s subunit back to the 5′ end of the mRNA to participate in a subsequent round of translation.
Impact and Applications
The elucidation of the process and control of protein synthesis provides a ready means by which scientists can manipulate these processes in cells. In addition to infectious diseases, insufficient dietary protein represents one of the greatest challenges to world health. The majority of people now living are limited to obtaining their dietary protein solely through the consumption of plant matter. Knowledge of the process of protein synthesis may allow molecular biologists to increase the amount of protein in important crop species. Moreover, most plants contain an imbalance in the amino acids needed in the human diet that can lead to disease. For example, protein from corn is poor in the amino acid lysine, whereas the protein from soybeans is poor in methionine and cysteine. Molecular biologists may be able to correct this imbalance by changing the codons present in plant genes, thus improving this source of protein for those people who rely on it for life.
Key terms
amino acid: the basic subunit of a protein; there are twenty commonly occurring amino acids, any of which may join together by chemical bonds to form a complex protein molecule
peptide bond: the chemical bond between amino acids in a protein
polypeptide: a linear molecule composed of amino acids joined together by peptide bonds; all proteins are functional polypeptides
RNA: ribonucleic acid, the molecule that acts as the messenger between genes in DNA and their protein product, directing the assembly of proteins; as an integral part of ribosomes, RNA is also involved in protein synthesis
translation: the process of forming proteins according to instructions contained in an RNA molecule
Bibliography
Atkins, John F., Raymond F. Gesteland, and Thomas Cech. RNA Worlds: From Life's Origins to Diversity in Gene Regulation. Cold Spring Harbor: Cold Spring Harbor Laboratory, 2011. Print.
Crick, Francis. “The Genetic Code III.” Scientific American 215 (1966): 57. Print.
Keiler, Kenneth C. Bacterial Regulatory RNA: Methods and Protocols. New York: Humana, 2012. Print.
Lake, James. “The Ribosome.” Scientific American 245 (1981): 84–97. Print.
Lewin, Benjamin, et al. Genes X. Sudbury: Jones, 2011. Print.
Liljas, Anders, and Måns Ehrenberg. Structural Aspects of Protein Synthesis. 2nd ed. Hackensack: World Scientific, 2013. Print.
Li Puma, Vito, and Carlo Bethaz. New Research on Protein Synthesis. New York: Nova Science, 2014. eBook Collection (EBSCOhost). Web. 26 Aug. 2014.
Rich, Alexander, and Sung Hou Kim. “The Three-Dimensional Structure of Transfer RNA.” Scientific American 238 (1978): 52–62. Print.
Tropp, Burton E., and David Freifelder. Molecular Biology: Genes to Proteins. 4th ed. Sudbury: Jones, 2012. Print.
Whitford, David. “Protein Synthesis, Processing, and Turnover.” Proteins: Structure and Function. Hoboken: Wiley, 2005. Print.