Gene Expression II: The Genetic Code and Protein Synthesis

Study Guide - Smart Notes

Introduction

This topic explores how genetic information encoded in DNA is translated into proteins, focusing on the genetic code, its experimental discovery, and its properties. Understanding these principles is fundamental to cell biology and molecular genetics.

The Genetic Code

Definition and Relationship

The genetic code is the set of rules by which the sequence of bases in DNA, read through an mRNA copy, specifies the sequence of amino acids in proteins.

The DNA base sequence determines the linear order of amino acids in protein products.
mRNA acts as the intermediary, carrying instructions from DNA to the ribosome for protein synthesis.

Key Terms

Gene: Functional unit of DNA that encodes one or more polypeptides or functional RNAs.
Coding strand: The DNA strand whose sequence matches the mRNA (except T is replaced by U).
Template strand: The DNA strand that is copied during mRNA synthesis.
Codon: A sequence of three nucleotides in mRNA that specifies an amino acid or a stop signal.
Alternative splicing: Process by which different combinations of exons are joined to produce multiple mRNA variants from a single gene.
Frameshift mutation: Mutation caused by insertion or deletion of nucleotides that shifts the reading frame.

Experimental Evidence for the Genetic Code

Beadle and Tatum: One Gene-One Enzyme Hypothesis

George Beadle and Edward Tatum used the bread mold Neurospora crassa to demonstrate the link between genes and enzymes.

Mutations induced by X-rays led to loss of ability to synthesize specific amino acids or vitamins.
Mutants survived only when missing substances were provided in the medium.
Each mutation disabled a single enzymatic step in a metabolic pathway.
These results led to the one gene-one enzyme hypothesis: each gene directs the production of a single enzyme.

Beadle and Tatum Experimental Design (Summary Table)

Mutant Class	Growth on Minimal Medium	Growth on Supplemented Medium	Inference
Wild type	Yes	Yes	No mutation
Class I	No	Yes (with supplement)	Mutation blocks synthesis of specific compound
Class II	No	Yes (with different supplement)	Mutation blocks different step

Ingram: One Gene-One Polypeptide Hypothesis

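Linus Pauling and, later, Vernon Ingram studied hemoglobin from patients with sickle-cell anemia.
Pauling's electrophoresis experiments showed that hemoglobin from sickle cells migrates differently from normal hemoglobin, indicating a difference in charge.
Ingram's trypsin digestion and peptide analysis revealed a single amino acid change: valine replaces glutamic acid at position 6 of the β-globin chain in sickle-cell hemoglobin.
Because the mutation altered a single polypeptide chain of a protein that is not an enzyme, the hypothesis was refined to one gene-one polypeptide.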

Normal vs. Sickle-Cell Hemoglobin (Summary Table)

Hemoglobin Type	Amino Acid at Position 6	Charge
Normal (HbA)	Glutamic acid	Negative
Sickle-cell (HbS)	Valine	Neutral

Properties of the Genetic Code

The Triplet Code

Four DNA bases (A, T, G, C) must specify 20 amino acids.
A doublet code (2 bases per codon) yields only 4 × 4 = 16 combinations, which is insufficient.
A triplet code (3 bases per codon) yields 4 × 4 × 4 = 64 possible combinations, more than enough for all amino acids.
Each codon consists of three nucleotides.
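
The counting argument above can be checked with a short Python sketch. The names DNA_BASES, AMINO_ACIDS_NEEDED, count_code_words, and ALL_TRIPLETS are chosen here for illustration and do not come from the course materials.

```python
from itertools import product

DNA_BASES = "ATGC"        # the four bases listed above
AMINO_ACIDS_NEEDED = 20   # amino acids the code must specify


def count_code_words(word_length: int) -> int:
    """Count distinct words of the given length over four bases (4 ** length)."""
    return len(DNA_BASES) ** word_length


# Enumerate every triplet explicitly as an independent check on the arithmetic.
ALL_TRIPLETS = ["".join(bases) for bases in product(DNA_BASES, repeat=3)]

if __name__ == "__main__":
    for length in (1, 2, 3):
        total = count_code_words(length)
        verdict = "sufficient" if total >= AMINO_ACIDS_NEEDED else "insufficient"
        print(f"{length} base(s) per codon: {total} combinations ({verdict})")
```

Three is the smallest word length that reaches 20, which is why a triplet code is the minimal workable design.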

Frameshift Mutations and Evidence for Triplet Code

Crick, Brenner, and others used proflavin to induce indel mutations in bacteriophage T4.
Frameshift mutations shift the reading frame, altering downstream amino acid sequence.
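Revertant mutations (a second indel of the opposite type near the first, such as a deletion close to an insertion) restore the reading frame, producing a pseudo wild-type phenotype.
Adding or removing three nucleotides changes the number of codons but does not alter the downstream reading frame, supporting the triplet nature of the code.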
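
The logic of these experiments can be traced on a toy sequence. The sketch below is only an illustration: the 18-base message and the positions of the insertions and deletion are hypothetical, and the sequence is simply grouped into triplets without being translated.

```python
def split_into_codons(sequence: str) -> list[str]:
    """Read a sequence as consecutive, nonoverlapping triplets from its first base.

    Any trailing one or two bases that do not fill a triplet are dropped.
    """
    usable_length = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable_length, 3)]


# Hypothetical 18-base message, chosen only for illustration.
ORIGINAL = "AUGGCUGAAUUCCGAUAA"

# Single insertion: one extra base after the fourth base shifts the frame.
INSERTION = ORIGINAL[:4] + "C" + ORIGINAL[4:]

# Insertion plus a nearby deletion: the frame is restored downstream.
SUPPRESSED = INSERTION[:9] + INSERTION[10:]

# Three inserted bases: one extra codon, but the downstream frame is unchanged.
TRIPLE_INSERTION = ORIGINAL[:4] + "CCC" + ORIGINAL[4:]

if __name__ == "__main__":
    for label, seq in [
        ("original", ORIGINAL),
        ("insertion", INSERTION),
        ("insertion+deletion", SUPPRESSED),
        ("three insertions", TRIPLE_INSERTION),
    ]:
        print(f"{label:>18}: {' '.join(split_into_codons(seq))}")
```

In the double mutant only the codons between the two changes differ from the original, which is why such phage can look nearly wild type.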

Degeneracy and Nonoverlapping Nature

64 codons, but only 20 amino acids.
Degenerate code: Most amino acids are specified by more than one codon.
Nonoverlapping code: Each nucleotide is part of only one codon; reading frame advances three nucleotides at a time.

Degeneracy Table (Sample)

Amino Acid	Codons
Leucine	UUA, UUG, CUU, CUC, CUA, CUG
Serine	UCU, UCC, UCA, UCG, AGU, AGC
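
As a small sketch of how degeneracy looks as a data structure, the table above can be inverted into a codon lookup. The names CODONS_BY_AMINO_ACID, build_codon_lookup, and CODON_LOOKUP are illustrative; only the two amino acids from the sample table are included.

```python
# Codon lists copied from the sample degeneracy table above.
CODONS_BY_AMINO_ACID = {
    "Leucine": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "Serine": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
}


def build_codon_lookup(codons_by_amino_acid: dict[str, list[str]]) -> dict[str, str]:
    """Invert an amino-acid -> codons table into a codon -> amino-acid lookup.

    Raises ValueError if a codon is listed under two amino acids, because
    that would make the table ambiguous.
    """
    lookup: dict[str, str] = {}
    for amino_acid, codon_list in codons_by_amino_acid.items():
        for codon in codon_list:
            if codon in lookup:
                raise ValueError(f"{codon} is assigned to more than one amino acid")
            lookup[codon] = amino_acid
    return lookup


CODON_LOOKUP = build_codon_lookup(CODONS_BY_AMINO_ACID)

if __name__ == "__main__":
    for amino_acid, codon_list in CODONS_BY_AMINO_ACID.items():
        print(f"{amino_acid}: {len(codon_list)} codons")
    print("CUG ->", CODON_LOOKUP["CUG"])
```

Many codons map to one amino acid (degenerate), but each codon appears only once in the lookup.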

Messenger RNA and Protein Synthesis

Role of mRNA

The genetic code is usually written in terms of mRNA codons: the order of nucleotides in mRNA directs the order of amino acids during protein synthesis.
mRNA is transcribed from DNA and then serves as the template for translation.

Differences Between mRNA Synthesis and DNA Replication

In mRNA synthesis, only one DNA strand of a gene is copied: the template strand. In DNA replication, both strands are copied.
The other strand is the coding strand; its sequence matches the mRNA, except that uracil (U) in mRNA replaces thymine (T).

Example: Coding vs. Template Strand

Strand	Sequence
Coding (DNA)	5'-ATGGGCGGC-3'
Template (DNA)	3'-TACCCGCCG-5'
mRNA	5'-AUGGGCGGC-3'
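
The relationship between the three sequences in the table can be reproduced with a minimal Python sketch. The function names transcribe_template and mrna_from_coding are hypothetical helpers written for these notes, and the sketch assumes the template is written 3' to 5' as in the table.

```python
# Base-pairing rules for transcription: template DNA base -> RNA base.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the mRNA (written 5' to 3') for a template strand written 3' to 5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5)


def mrna_from_coding(coding_5_to_3: str) -> str:
    """Return the mRNA by replacing T with U in the coding strand (5' to 3')."""
    return coding_5_to_3.replace("T", "U")


CODING = "ATGGGCGGC"    # 5'-ATGGGCGGC-3'
TEMPLATE = "TACCCGCCG"  # 3'-TACCCGCCG-5'

if __name__ == "__main__":
    print("from template:", transcribe_template(TEMPLATE))
    print("from coding:  ", mrna_from_coding(CODING))
```

Both routes give the same mRNA, which is why the coding strand is the one usually quoted when a gene sequence is written down.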

Experimental Elucidation of the Genetic Code

Cell-Free Systems

Marshall Nirenberg and J. Heinrich Matthaei used cell-free systems to study protein synthesis.
Synthetic RNAs of known sequence were added to bacterial extracts.
Polynucleotide phosphorylase was used to make synthetic RNA molecules.

Homopolymers and Copolymers

Homopolymer: RNA made from a single type of nucleotide (e.g., poly(U)).
Poly(U) directed incorporation of phenylalanine, showing UUU codes for phenylalanine.
Further experiments showed AAA codes for lysine, CCC for proline.
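Copolymers (RNAs synthesized from a mixture of two or more nucleotides) contain several different codons in predictable proportions, which helped assign further codons to amino acids.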
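
A short sketch of the reasoning behind these experiments follows. It assumes that bases in a random copolymer are incorporated independently, so the expected frequency of a triplet is the product of its base fractions. The 3:1 ratio of U to G is a hypothetical example, not a value from the original experiments, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Homopolymer assignments stated in the notes above.
HOMOPOLYMER_CODONS = {"UUU": "Phe", "AAA": "Lys", "CCC": "Pro"}


def translate_homopolymer(base: str, length: int = 12) -> list[str]:
    """Translate a homopolymer such as poly(U); only U, A, and C are covered."""
    rna = base * length
    usable_length = length - length % 3
    return [HOMOPOLYMER_CODONS[rna[i:i + 3]] for i in range(0, usable_length, 3)]


def expected_triplet_frequencies(base_fractions: dict[str, Fraction]) -> dict[str, Fraction]:
    """Expected frequency of each triplet in a random copolymer.

    Assumes independent incorporation: frequency = product of base fractions.
    """
    frequencies: dict[str, Fraction] = {}
    for triplet in product(base_fractions, repeat=3):
        frequency = Fraction(1)
        for base in triplet:
            frequency *= base_fractions[base]
        frequencies["".join(triplet)] = frequency
    return frequencies


# Hypothetical 3:1 mixture of U and G, for illustration only.
UG_MIX = {"U": Fraction(3, 4), "G": Fraction(1, 4)}

if __name__ == "__main__":
    print("poly(U):", "-".join(translate_homopolymer("U")))
    for triplet, frequency in sorted(expected_triplet_frequencies(UG_MIX).items()):
        print(triplet, frequency)
```

Comparing such predicted triplet frequencies with the measured proportions of incorporated amino acids narrows down the base composition of each codon, though not the order of bases within it.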

Assignment of Codons

H. Gobind Khorana synthesized RNAs with defined repeating sequences (for example, alternating U and C) to narrow codon assignments.
Eventually, all 64 codons were assigned to specific amino acids or stop signals.
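
Why repeating sequences are so informative can be seen by listing the codons they contain in each reading frame. The sketch below is illustrative only; the helper names and the choice of 12 repeats are assumptions made for these notes.

```python
def codons_in_frame(rna: str, frame: int) -> list[str]:
    """Split an RNA into triplets starting at offset `frame` (0, 1, or 2)."""
    usable = rna[frame:]
    usable_length = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, usable_length, 3)]


def distinct_codons_by_frame(repeat_unit: str, copies: int = 12) -> dict[int, list[str]]:
    """List the distinct codons a repeating RNA presents in each reading frame."""
    rna = repeat_unit * copies
    return {frame: sorted(set(codons_in_frame(rna, frame))) for frame in range(3)}


if __name__ == "__main__":
    # Repeating dinucleotide: every frame alternates between the same two codons.
    print("(UC)n :", distinct_codons_by_frame("UC"))
    # Repeating trinucleotide: each frame contains one codon, repeated.
    print("(UUC)n:", distinct_codons_by_frame("UUC"))
```

A repeating dinucleotide such as (UC)n therefore yields a polypeptide of two alternating amino acids (UCU is a serine codon and CUC a leucine codon), while a repeating trinucleotide such as (UUC)n yields three different single-amino-acid chains, one per reading frame.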

Codon Table (Summary)

Codon	Amino Acid
UUU, UUC	Phe
AUG	Met (Start)
UAA, UAG, UGA	Stop
AAA	Lys
CCC	Pro
... (see full codon table for all assignments)	...

Additional Properties of the Genetic Code

Unambiguous and Degenerate

Each codon specifies only one amino acid (unambiguous).
Many amino acids are specified by multiple codons (degenerate).
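Because the code is degenerate, some codon mutations (often in the third position) are silent and leave the amino acid unchanged; others change the amino acid or create a stop codon.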
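
The effect of a single-codon change can be sketched with a partial code table. PARTIAL_CODE below contains only the phenylalanine, leucine, and stop codons listed in these notes, and classify_substitution is a hypothetical helper, so any codon outside that subset is reported as unknown.

```python
# Partial table: only codons that appear in these notes.
PARTIAL_CODE = {
    "UUU": "Phe", "UUC": "Phe",
    "UUA": "Leu", "UUG": "Leu", "CUU": "Leu",
    "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def classify_substitution(codon_before: str, codon_after: str) -> str:
    """Classify a codon change as silent, missense, nonsense, or stop-lost."""
    old = PARTIAL_CODE.get(codon_before)
    new = PARTIAL_CODE.get(codon_after)
    if old is None or new is None:
        return "unknown (codon not in partial table)"
    if old == new:
        return "silent"
    if new == "Stop":
        return "nonsense"
    if old == "Stop":
        return "stop-lost"
    return "missense"


if __name__ == "__main__":
    for before, after in [("CUU", "CUC"), ("CUU", "UUU"), ("UUA", "UAA")]:
        print(f"{before} -> {after}: {classify_substitution(before, after)}")
```

The three examples show a third-position change that is silent, a first-position change that swaps leucine for phenylalanine, and a second-position change that creates a stop codon.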

Universality of the Genetic Code

The genetic code is nearly universal among all organisms.
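Exceptions exist in mitochondria and in some bacteria and single-celled eukaryotes, where a few codon assignments differ (for example, UGA specifies tryptophan rather than stop in human mitochondria).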
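
One way to picture a variant code is as the standard table with a few entries overridden. The sketch below uses only a five-codon subset and the well-known vertebrate mitochondrial differences; the dictionary and function names are illustrative, and a complete table would be needed for real translation.

```python
# Small subset of the standard code (not a full table).
STANDARD_SUBSET = {
    "AUG": "Met",
    "AUA": "Ile",
    "AGA": "Arg",
    "AGG": "Arg",
    "UGA": "Stop",
}

# Codons whose meaning differs in vertebrate mitochondria.
VERTEBRATE_MITO_OVERRIDES = {
    "UGA": "Trp",
    "AUA": "Met",
    "AGA": "Stop",
    "AGG": "Stop",
}


def build_variant_code(standard: dict[str, str], overrides: dict[str, str]) -> dict[str, str]:
    """Return a copy of the standard table with variant assignments applied."""
    return {**standard, **overrides}


def differing_codons(code_a: dict[str, str], code_b: dict[str, str]) -> list[str]:
    """List codons present in both tables whose assignments differ."""
    return sorted(c for c in code_a if c in code_b and code_a[c] != code_b[c])


MITO_SUBSET = build_variant_code(STANDARD_SUBSET, VERTEBRATE_MITO_OVERRIDES)

if __name__ == "__main__":
    for codon in differing_codons(STANDARD_SUBSET, MITO_SUBSET):
        print(f"{codon}: {STANDARD_SUBSET[codon]} -> {MITO_SUBSET[codon]}")
```

Only a handful of entries change, which is why the code is described as nearly universal.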

Key Equations and Concepts

Number of Possible Codons

Number of possible codons: $4^3 = 64$

Central Dogma of Molecular Biology

DNA → RNA → Protein

Summary Table: Properties of the Genetic Code

Property	Description
Triplet	Three nucleotides per codon
Degenerate	Multiple codons for most amino acids
Nonoverlapping	Each nucleotide is part of only one codon
Unambiguous	Each codon specifies only one amino acid
Nearly Universal	Same code used by most organisms

Conclusion

The genetic code is a fundamental principle of molecular biology, linking DNA sequence to protein structure and function. Its discovery and characterization were achieved through classic experiments and remain central to understanding gene expression in all cells.