Gene Expression II: The Genetic Code and Protein Synthesis – Study Notes

Introduction

This chapter explores how genetic information encoded in DNA is translated into proteins, focusing on the genetic code, its experimental discovery, and its properties. Understanding these concepts is fundamental to cell biology and molecular genetics.

The Genetic Code

Definition and Overview

The genetic code is the set of rules by which the sequence of bases in DNA (and its RNA transcript) is converted into the sequence of amino acids in proteins. This code is nearly universal among living organisms.

DNA base sequence determines the linear order of amino acids in proteins.
mRNA acts as the intermediary, carrying instructions from DNA to the ribosome for protein synthesis.

Key Terms

Gene: Functional unit of DNA encoding one or more polypeptides or functional RNAs.
Coding Strand: The DNA strand whose sequence matches the mRNA (except T is replaced by U).
Template Strand: The DNA strand used as a template for mRNA synthesis.
Codon: A sequence of three nucleotides in mRNA that specifies an amino acid or a stop signal.
Alternative Splicing: Process by which different combinations of exons are joined to produce multiple mRNA variants from a single gene.
Frameshift Mutation: Mutation caused by insertion or deletion of nucleotides that shifts the reading frame.

Experimental Evidence for the Genetic Code

Beadle and Tatum: One Gene–One Enzyme Hypothesis

George Beadle and Edward Tatum used the bread mold Neurospora crassa to demonstrate that genes encode enzymes.

Mutations induced by X-rays led to inability to synthesize specific amino acids or vitamins.
Mutants survived only when missing substances were supplied in the medium.
Each mutation disabled a single enzymatic step in a metabolic pathway.
Led to the one gene–one enzyme hypothesis.

Class	Growth on Minimal Medium	Growth on Supplemented Medium
Wild type	Yes	Yes
Mutant (missing enzyme)	No	Yes (if supplement provided)
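
The growth logic in the table can be sketched in a few lines of Python. This is only an illustration: the four-compound pathway and the function name `grows` are hypothetical, and the model assumes that a strain grows whenever it can reach the end product from the furthest-along compound available in the medium.

```python
from typing import Optional

# Hypothetical linear pathway; step i converts PATHWAY[i] into PATHWAY[i + 1].
PATHWAY = ["precursor", "intermediate_1", "intermediate_2", "end_product"]

def grows(blocked_step: Optional[int], supplement: Optional[str] = None) -> bool:
    """Return True if the strain can obtain the end product.

    blocked_step: index of the missing enzymatic step (None for wild type).
    supplement:   compound added to minimal medium (None for minimal medium).
    """
    # Minimal medium supplies only the precursor (index 0).
    start = 0 if supplement is None else PATHWAY.index(supplement)
    # Every remaining step between the starting compound and the end product
    # must still be functional.
    for step in range(start, len(PATHWAY) - 1):
        if step == blocked_step:
            return False
    return True

print(grows(None))                  # wild type on minimal medium
print(grows(1))                     # mutant on minimal medium
print(grows(1, "intermediate_2"))   # supplement after the block
print(grows(1, "intermediate_1"))   # supplement before the block
```

Under these assumptions a mutant is rescued only by compounds that lie after its blocked step, which is the reasoning used to order the steps of a pathway.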

Ingram, Pauling: One Gene–One Polypeptide Hypothesis

Linus Pauling and Vernon Ingram studied sickle-cell anemia, showing that a single amino acid change in hemoglobin causes disease.

Electrophoresis revealed different migration patterns for normal and sickle-cell hemoglobin.
Trypsin digestion and peptide analysis showed a single amino acid substitution: valine replaces glutamic acid.
Because hemoglobin is not an enzyme, the hypothesis was broadened from one gene–one enzyme to one gene–one polypeptide.

Hemoglobin Type	Amino Acid at Position 6 of β-Globin	Side-Chain Charge
Normal (Hb-A)	Glutamic acid	Negative
Sickle-cell (Hb-S)	Valine	Neutral

Gene Function Complexity

Alternative Splicing and Functional RNAs

Gene function is more complex than originally thought.

Most eukaryotic genes contain noncoding sequences (introns).
Alternative splicing allows a single gene to produce multiple polypeptides.
Some genes encode functional RNAs (e.g., rRNA, tRNA) rather than proteins.
Genes are defined as functional units of DNA that encode one or more polypeptides or functional RNAs.
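
How one gene can give rise to several mRNAs can be illustrated with a small sketch. The exon names and sequences below are invented for illustration, and the sketch assumes that the first and last exons are always retained while any internal exon may be skipped; real splicing is regulated and does not produce every combination.

```python
from itertools import combinations

# Hypothetical exons of a single gene (sequences are made up).
EXONS = {"E1": "AUGGCU", "E2": "GAAUUC", "E3": "GGAUCC", "E4": "CCGUAA"}

def isoforms(exon_names):
    """List exon combinations that keep the first and last exon in order."""
    first, *internal, last = exon_names
    result = []
    for k in range(len(internal) + 1):
        # combinations() preserves the original exon order.
        for kept in combinations(internal, k):
            result.append([first, *kept, last])
    return result

def splice(exon_names):
    """Join the chosen exons into one mature mRNA sequence."""
    return "".join(EXONS[name] for name in exon_names)

for names in isoforms(list(EXONS)):
    print("-".join(names), splice(names))
```

With four exons and two optional internal exons, the sketch lists four possible mRNA variants from the same gene.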

The Triplet Nature of the Genetic Code

Triplet Code

The genetic code is composed of three-nucleotide codons.

Four DNA bases (A, T, G, C) must specify 20 amino acids.
Doublet code (two bases per codon) yields only 16 combinations, insufficient for 20 amino acids.
Triplet code yields 64 possible combinations, more than enough for all amino acids.
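
The counting argument above can be checked by enumerating every possible code word. This is a minimal sketch; the function name `count_codewords` is chosen here for illustration.

```python
from itertools import product

BASES = "ATGC"  # the four DNA bases

def count_codewords(length: int) -> int:
    """Count the distinct words of a given length built from four bases."""
    return len(list(product(BASES, repeat=length)))

for n in (1, 2, 3):
    print(n, count_codewords(n))
```

The counts are 4, 16, and 64 for one, two, and three bases per code word, so three is the smallest word length that can cover 20 amino acids.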

Frameshift Mutations

Frameshift mutations provided evidence for the triplet nature of the code.

Proflavin, used as a mutagen in bacteriophage T4, induces insertions and deletions (indels) of single base pairs.
Indel mutations shift the reading frame, altering downstream amino acid sequence.
Revertant mutations (a second indel of the opposite type near the first, e.g., a deletion close to an insertion) can restore the reading frame, resulting in a pseudo wild-type phenotype.

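Mutation Type	Effect on Reading Frame	Protein Product
Single indel (+ or -)	Shifted	Abnormal
Two indels of the same type (+ + or - -)	Still shifted	Abnormal
One insertion and one nearby deletion (+ -)	Restored	Normal or nearly normal (pseudo wild-type)
Three indels of the same type (+ + + or - - -)	Restored	Normal or nearly normal (pseudo wild-type)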
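
The effect of indels on the reading frame can be reproduced with a short sketch. The repeating message `CATCAT...` and the helper names are hypothetical; the point is only to show which combinations of insertions and deletions bring downstream codons back into register.

```python
def read_codons(seq: str) -> list:
    """Split a sequence into consecutive, nonoverlapping triplets."""
    usable = len(seq) - len(seq) % 3  # ignore an incomplete final triplet
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def insert_base(seq: str, pos: int, base: str = "A") -> str:
    """Insert one base before position pos."""
    return seq[:pos] + base + seq[pos:]

def delete_base(seq: str, pos: int) -> str:
    """Delete the base at position pos."""
    return seq[:pos] + seq[pos + 1:]

wild_type = "CAT" * 6                  # hypothetical message, read as CAT CAT ...
one = insert_base(wild_type, 3)        # one insertion: frame shifted
two = insert_base(one, 5)              # two insertions: still shifted
three = insert_base(two, 7)            # three insertions: frame restored
plus_minus = delete_base(one, 6)       # insertion plus nearby deletion: restored

for label, seq in [("wild type", wild_type), ("+", one), ("+ +", two),
                   ("+ + +", three), ("+ -", plus_minus)]:
    print(f"{label:10s}", " ".join(read_codons(seq)))
```

In the restored cases only the codons between the mutations differ from wild type, which is why the resulting protein is often functional.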

Properties of the Genetic Code

Degeneracy and Nonoverlapping Nature

The genetic code has several important properties:

Degenerate code: Most amino acids are specified by more than one codon.
Nonoverlapping: Each nucleotide is part of only one codon; the reading frame advances three nucleotides at a time.
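
Degeneracy can be made concrete by counting how many codons specify each amino acid. The sketch below builds the standard codon table from a compact 64-letter string (one-letter amino acid codes, `*` for stop, bases in the order U, C, A, G); the variable names are chosen here for illustration.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Count sense codons per amino acid, leaving out the stop signal.
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

for aa, n in sorted(codons_per_aa.items(), key=lambda item: -item[1]):
    print(aa, n)
```

Leucine, serine, and arginine each have six codons, while methionine and tryptophan have only one, which is why the code is described as degenerate for most, not all, amino acids.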

Start and Stop Codons

Of the 64 possible codons, 61 code for amino acids.
AUG is the start codon (methionine).
UAA, UAG, UGA are stop codons, terminating translation.

Codon	Function
AUG	Start (Methionine)
UAA, UAG, UGA	Stop
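
The roles of the start and stop codons can be sketched as a simple translation routine. This is a simplification under stated assumptions: translation is taken to begin at the first AUG in the sequence and to end at the first in-frame stop codon, and the example mRNA is hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
SENSE_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]

def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break  # stop codon terminates translation
        protein.append(aa)
    return "".join(protein)

print(STOP_CODONS, len(SENSE_CODONS))
print(translate("GGAUGUUUAAAUAGCCC"))  # hypothetical mRNA
```

In the example, the bases before AUG and after UAG are not translated, so the product is the three-residue peptide Met-Phe-Lys (MFK in one-letter code).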

Unambiguous and Universal Code

Each codon specifies only one amino acid (unambiguous).
The code is nearly universal across all organisms, with rare exceptions (e.g., mitochondria, some bacteria).

Experimental Elucidation of the Genetic Code

Cell-Free Systems and Synthetic RNAs

Marshall Nirenberg and J. Heinrich Matthaei used cell-free systems to decipher the genetic code.

Synthetic RNAs of known sequence were added to cell-free extracts.
Polynucleotide phosphorylase was used to make RNAs of predictable base composition.
Homopolymers directed the incorporation of a single amino acid: poly(U) yielded polyphenylalanine, showing that UUU codes for phenylalanine.
Copolymer experiments and alternating sequence RNAs (Khorana) allowed assignment of codons to amino acids.

RNA Sequence	Incorporated Amino Acid
UUU...	Phenylalanine
AAA...	Lysine
CCC...	Proline
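
The logic of the synthetic RNA experiments can be sketched by translating homopolymers and an alternating copolymer in silico. The sketch assumes that translation begins at an arbitrary base without a start codon, and the function name `translate_frame` is hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_frame(rna: str, frame: int = 0) -> str:
    """Translate every complete triplet starting at the given offset."""
    return "".join(
        CODON_TABLE[rna[i:i + 3]] for i in range(frame, len(rna) - 2, 3)
    )

# Homopolymers contain a single codon, so one amino acid is incorporated.
for base in "UAC":
    print(f"poly({base})", translate_frame(base * 12))

# An alternating copolymer contains two codons (UCU and CUC) in any frame,
# so the product alternates between two amino acids.
poly_uc = "UC" * 6
print("poly(UC) frame 0", translate_frame(poly_uc, 0))
print("poly(UC) frame 1", translate_frame(poly_uc, 1))
```

A homopolymer assigns one codon directly, whereas an alternating copolymer narrows two codons down to two amino acids; deciding which codon matches which amino acid requires additional evidence.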

Messenger RNA and Protein Synthesis

Role of mRNA

Messenger RNA (mRNA) guides the synthesis of polypeptide chains by providing the template for translation.

mRNA is transcribed from DNA, using the template strand.
In mRNA synthesis, uracil (U) replaces thymine (T).
Only one DNA strand is copied (template strand); the coding strand matches the mRNA sequence (except T/U).

Example: Coding vs. Template Strand

Strand	Sequence
Coding Strand (DNA)	5'-ATGGGCGGCTCC-3'
Template Strand (DNA)	3'-TACCCGCCGAGG-5'
mRNA	5'-AUGGGCGGCUCC-3'

The mRNA sequence is translated into a polypeptide: Met-Gly-Gly-Ser.
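
The example above can be reproduced step by step in code. This is a minimal sketch: the helper names `transcribe` and `translate` are chosen here, the template strand is supplied in the 3'-to-5' orientation shown in the table, and translation simply starts at the first base because the example begins with AUG.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
# DNA template base -> complementary RNA base
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5'->3') from a template strand written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

def translate(mrna: str) -> str:
    """Translate from the first base, stopping at a stop codon if present."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return "-".join(residues)

coding = "ATGGGCGGCTCC"      # 5'->3'
template = "TACCCGCCGAGG"    # 3'->5'

mrna = transcribe(template)
peptide = translate(mrna)
print(mrna)
print(mrna == coding.replace("T", "U"))  # mRNA matches the coding strand
print(peptide)
```

Transcribing the template strand gives the same sequence as the coding strand with U in place of T, and reading that mRNA three bases at a time gives Met-Gly-Gly-Ser.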

Summary Table: Properties of the Genetic Code

Property	Description
Triplet	Three nucleotides per codon
Degenerate	Most amino acids have multiple codons
Nonoverlapping	Codons are read sequentially, three bases at a time
Unambiguous	Each codon specifies only one amino acid
Nearly Universal	Same code used by most organisms

Key Equations

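Number of possible codons: $4^3 = 64$ (four possible bases at each of three positions); a doublet code would give only $4^2 = 16$.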

Conclusion

The genetic code is a fundamental concept in cell biology, linking the information in DNA to the synthesis of proteins. Its triplet, degenerate, and nearly universal nature allows for the faithful translation of genetic information in all living cells.