Transcription and RNA Processing: From DNA to Protein

Transcription and RNA Processing

Introduction to Transcription

Transcription is the process by which genetic information encoded in DNA is copied into RNA, serving as the first step in gene expression. The human genome consists of approximately 3.1 billion base pairs and about 20,000 protein-coding genes, yet only about 1.5% of the DNA codes for proteins. Identifying genes within this vast genome is a key challenge in molecular biology.

  • Gene: A short stretch of DNA on a chromosome, typically composed of a regulatory region (upstream) and a coding region (downstream).

  • Stages of gene expression:

    1. Transcription

    2. Translation

What is a Gene?

A gene consists of two main parts: the regulatory region, which controls gene expression, and the coding region, which contains the information to make proteins. The regulatory region includes binding sites for RNA polymerase, which initiates transcription.

  • Regulatory region: Upstream DNA sequence that controls transcription initiation.

  • Coding region: Downstream DNA sequence that is transcribed into mRNA.

Transcription Overview

Transcription is similar to DNA replication but differs in several key aspects. Only one DNA strand (the template strand) is copied, and the product is RNA, not DNA. RNA polymerase is the enzyme responsible for synthesizing RNA.

  • Template strand: The DNA strand used to synthesize RNA.

  • RNA polymerase: Enzyme that synthesizes RNA from the DNA template.

  • Product: Messenger RNA (mRNA), which carries genetic information to ribosomes for protein synthesis.

Transcription Example

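Given a DNA template strand, the corresponding RNA sequence is produced by complementary base pairing, with uracil (U) in the RNA taking the place of thymine (T). The RNA is antiparallel to the template, so a template written 3' to 5' gives an RNA written 5' to 3'.

  • DNA (template strand): 3'-ATCGGCAGGACCTTAAAT-5'

  • RNA: 5'-UAGCCGUCCUGGAAUUUA-3'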
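The base-pairing rule in this example can be sketched in a few lines of Python. This is only an illustration: the function name `transcribe` and the convention of passing the template written 3' to 5' are assumptions made for this sketch, not part of the original notes.

```python
# Base added to the RNA for each base read on the DNA template strand.
# Adenine on the template pairs with uracil (U) in RNA, not thymine.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') for a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

print(transcribe("ATCGGCAGGACCTTAAAT"))  # UAGCCGUCCUGGAAUUUA
```

Because the template is read 3' to 5' while the RNA grows 5' to 3', no reversal is needed when the two strands are written in these orientations.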

Genes and Disease: Garrod's Observations

In 1902, Archibald Garrod observed that certain diseases are hereditary and result from defective enzymes. This led to the hypothesis that genes carry the information needed to make enzymes. Maple Syrup Urine Disease is one example of such an inherited enzyme defect.

  • Maple Syrup Urine Disease: A genetic disorder caused by a defective enzyme; it is more prevalent in certain populations.

Beadle and Tatum (1941): One Gene-One Enzyme Hypothesis

Beadle and Tatum's experiments with Neurospora crassa demonstrated that each gene codes for a specific enzyme. Mutations in genes disrupt enzyme production, affecting metabolic pathways.

  • Wild type: Grows on minimal media by synthesizing all necessary molecules.

  • Mutant: Requires supplementation (e.g., leucine) due to defective enzyme.

Central Dogma of Molecular Biology

The central dogma describes the flow of genetic information: DNA is transcribed into RNA, which is then translated into protein. This process is fundamental to cell biology.

  • Replication: DNA is copied to produce more DNA.

  • Transcription: DNA is copied into RNA.

  • Translation: RNA is used to synthesize proteins.

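Equation: $\text{DNA} \xrightarrow{\text{transcription}} \text{RNA} \xrightarrow{\text{translation}} \text{protein}$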

Translation: From Nucleotides to Amino Acids

Translation is the process by which the sequence of nucleotides in mRNA is converted into a sequence of amino acids, forming a protein. Each set of three nucleotides (codon) specifies one amino acid.

  • DNA/RNA bases: Adenine (A), Cytosine (C), Guanine (G), Thymine (T, DNA only), Uracil (U, RNA only)

  • Protein amino acids: Alanine, Arginine, Asparagine, Aspartate, Cysteine, Glutamine, Glutamate, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine

Codons and the Genetic Code

A codon is a unit of information in mRNA that encodes one amino acid. There are 20 amino acids and 4 nucleotide bases (A, U, G, C). Codons are three nucleotides long, allowing for 64 possible combinations. The code is degenerate, meaning multiple codons can specify the same amino acid.

  • Number of possible codons: $4^3 = 64$

  • Start codon: AUG (Methionine)

  • Stop codons: UAA, UAG, UGA
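The counting argument and the degeneracy of the code can be checked with a short Python sketch. The names `BASES`, `AMINO_ACIDS`, and `CODON_TABLE` are chosen for this illustration; the 64-letter string is the standard genetic code written in one-letter amino acid symbols (with `*` for stop), ordered with the bases U, C, A, G.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in the order UUU, UUC, UUA, UUG, UCU, ...
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every three-base combination with its amino acid.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

codons_per_residue = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))          # 64, i.e. 4**3
print(codons_per_residue["*"])   # 3 stop codons
print(codons_per_residue["L"])   # 6 codons specify leucine
print(codons_per_residue["M"])   # 1 codon (AUG) specifies methionine
```

With 61 sense codons shared among 20 amino acids, most amino acids necessarily have more than one codon, which is what "degenerate" means here.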

Codon Table

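| First Base | Second Base | Third Base | Amino Acid |
| --- | --- | --- | --- |
| U | U | U | Phe (F) |
| A | U | G | Met (M) - Start |
| U | A | A | Stop |
| U | A | G | Stop |
| U | G | A | Stop |
| G | G | G | Gly (G) |
| C | A | G | Gln (Q) |
| A | A | A | Lys (K) |
| G | A | A | Glu (E) |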

Reading Frames

The reading frame determines how nucleotides are grouped into codons. There are three possible reading frames for any mRNA sequence, and the correct frame is set by the start codon. Frameshift mutations can alter the reading frame, resulting in abnormal proteins.

  • Reading frame-1: 5' CAC GGU CGA UGA GGU UAC AUA AC 3' (His Gly Arg STOP)

  • Reading frame-2: 5' C ACG GUC GAU GAG GUU ACA UAA C 3' (Thr Val Asp Glu Val Thr STOP)

  • Reading frame-3: 5' CA CGG UCG AUG AGG UUA CAU AAC 3' (Met Arg Leu His Asn, reading from the AUG start codon)
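The three frames above can be reproduced with a small Python sketch. The helper name `translate_frame` is hypothetical, and the sketch simply lists every complete codon in a frame in one-letter symbols (`*` for stop); it does not wait for a start codon or halt at a stop codon, as a ribosome would.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate_frame(mrna: str, offset: int) -> str:
    """Translate every complete codon of `mrna`, starting `offset` bases in."""
    codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)

mrna = "CACGGUCGAUGAGGUUACAUAAC"
for offset in range(3):
    print(f"frame {offset + 1}: {translate_frame(mrna, offset)}")
# frame 1: HGR*GYI
# frame 2: TVDEVT*
# frame 3: RSMRLHN
```

Only frame 3 contains an AUG, so a ribosome starting there would produce Met-Arg-Leu-His-Asn from this fragment.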

Crick and Brenner: Codon Structure

Crick and Brenner's experiments demonstrated that codons are read in a continuous sequence without spaces. Frameshift mutations, caused by nucleotide insertions or deletions, disrupt the reading frame and alter the resulting protein.

  • Frameshift mutation: Alters the reading frame and protein sequence.
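A short Python sketch makes the effect of a frameshift concrete. The mRNA sequence and the helper name `translate` are invented for this illustration; the sketch reads from the first base and stops at the first stop codon. It also shows why inserting three bases, rather than one, leaves the downstream codons intact.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

original = "AUGGCUGAAUUUAAAUAA"                   # hypothetical mRNA: AUG GCU GAA UUU AAA UAA
one_inserted = original[:3] + "C" + original[3:]      # +1 base: frame shifts
three_inserted = original[:3] + "CCC" + original[3:]  # +3 bases: frame preserved

print(translate(original))        # MAEFK
print(translate(one_inserted))    # MR
print(translate(three_inserted))  # MPAEFK
```

The single-base insertion changes every downstream codon and creates a premature stop, while the three-base insertion adds one amino acid and leaves the rest of the protein unchanged.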

Nirenberg (1961): Deciphering the Genetic Code

Nirenberg's experiments used synthetic mRNA to determine which amino acids are encoded by specific codons. For example, UUU codes for phenylalanine, and codons such as UAA, UAG, and UGA are stop codons that terminate translation.

  • Example: Poly-U mRNA (5'-UUUUUU...-3') encodes polyphenylalanine.

  • Stop codons: UAA, UAG, UGA (no amino acid is added; translation terminates)

Transcription Process in Prokaryotes

Prokaryotic transcription involves RNA polymerase, which consists of a core enzyme (α, α, β, β') and a sigma (σ) factor that directs the enzyme to promoter sequences. Unlike DNA polymerase, RNA polymerase does not require a primer to initiate synthesis.

  • Promoter: DNA sequence where RNA polymerase binds to initiate transcription (-35, -10, +1 regions).

  • Elongation: RNA polymerase synthesizes RNA in the transcription bubble, adding nucleotides to the 3' end.

  • Termination: Transcription ends via mechanisms such as hairpin formation, causing RNA polymerase to dissociate from DNA.
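As a rough illustration of how the -35 and -10 promoter regions might be located in a sequence, the Python sketch below searches for an exact match to a consensus promoter. Several things here are assumptions not stated in the notes: the consensus sequences TTGACA (-35) and TATAAT (-10) and the 16 to 18 bp spacer describe typical E. coli σ70 promoters, the function name `find_promoters` is hypothetical, and the demo sequence is invented. Real promoters rarely match the consensus exactly, so this is a teaching sketch rather than a usable promoter finder.

```python
import re

# Assumed consensus: -35 box TTGACA, a 16-18 bp spacer, then -10 box TATAAT.
PROMOTER = re.compile(r"TTGACA[ACGT]{16,18}TATAAT")

def find_promoters(coding_strand: str) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of exact consensus matches, written 5'->3'."""
    return [(m.start(), m.end()) for m in PROMOTER.finditer(coding_strand.upper())]

# Invented sequence: 2 bp, -35 box, 17 bp spacer, -10 box, then downstream bases.
demo = "GC" + "TTGACA" + "ATGCATGCATGCATGCA" + "TATAAT" + "GCGCGCA"
print(find_promoters(demo))  # [(2, 31)]
```

Transcription would begin a few bases downstream of the -10 box, at the position labeled +1.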

Eukaryotic Differences in Transcription

Eukaryotes have three specialized RNA polymerases and more complex transcription initiation, involving multiple transcription factors. Transcription occurs in the nucleus, while translation occurs in the cytoplasm. Eukaryotic mRNA undergoes extensive processing before translation.

  • RNA polymerases: I, II, III (each transcribes different types of RNA)

  • Transcription factors: Proteins that assist RNA polymerase binding to the promoter.

  • Spatial separation: Transcription in nucleus, translation in cytoplasm.

Eukaryotic mRNA Structure and Processing

Eukaryotic mRNA is modified after transcription to increase stability and facilitate translation. Modifications include a 5' cap and a 3' poly-A tail.

  • 5' cap: Addition of a methylated guanine nucleotide to the 5' end; prevents degradation.

  • 3' poly-A tail: Addition of 100-200 adenine nucleotides to the 3' end; increases transcript stability.

Primary vs. Mature Transcripts

The primary transcript (pre-mRNA) contains both exons (coding regions) and introns (non-coding regions). Introns are removed by the spliceosome to produce mature mRNA.

  • Exons: Protein-coding sequences retained in mature mRNA.

  • Introns: Non-coding sequences removed during RNA processing.

mRNA Splicing

Splicing is carried out by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs) and proteins. The spliceosome recognizes specific sequences at the intron-exon boundaries and removes introns, joining exons together.

  • Spliceosome: snRNPs + protein complex that mediates splicing.

  • Recognition sites: GU at the 5' end of the intron, AG at the 3' end of the intron, and a branch point A within the intron, near its 3' end.

  • Process:

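    1. The spliceosome cuts at the 5' end of the intron (the GU site), where the branch point A attacks.

    2. The freed 5' end of the intron joins the branch point A, forming a lariat structure.

    3. The 3' end of exon 1 attacks the 3' end of the intron (the AG site); the exons are joined and the intron lariat is released.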
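The net result of these steps, removing the intron and joining the exons, can be sketched in Python. This is a simplified model under stated assumptions: the intron positions are supplied by the caller rather than discovered, the function name `splice` and the example sequences are invented, and only the GU...AG boundary rule is checked (the branch point and lariat chemistry are not modeled).

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) half-open index pairs and join the exons."""
    mature = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Canonical introns begin with GU and end with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        mature.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end
    mature.append(pre_mrna[position:])           # keep the final exon
    return "".join(mature)

# Hypothetical pre-mRNA: exon 1, one intron, exon 2.
exon1, intron, exon2 = "AUGGCU", "GUAAGUCUAACUUUCAG", "GAAUAA"
pre_mrna = exon1 + intron + exon2
print(splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))]))  # AUGGCUGAAUAA
```

Passing a different set of intron coordinates for the same pre-mRNA would give a different mature transcript, which is the basic idea behind alternative splicing.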

Example: Lactase (LCT) Mutation

Lactase persistence is an example of a mutation affecting gene regulation. Mutations in the regulatory region of the LCT gene allow continued expression of lactase, enabling adults to digest lactose. This trait spread in populations practicing cattle herding.

  • Lactase persistence: Mutation in regulatory region leads to continued lactase production in adults.

  • Evolutionary significance: Provided a selective advantage in populations with dairy-based diets.

Summary Table: Key Differences in Transcription

| Feature | Prokaryotes | Eukaryotes |
| --- | --- | --- |
| RNA Polymerase | Single type | Three types (I, II, III) |
| Initiation | Sigma factor, simple promoter | Multiple transcription factors, complex promoter |
| Location | Cytoplasm | Nucleus |
| RNA Processing | Minimal | 5' cap, poly-A tail, splicing |

Additional info:

  • Frameshift mutations can have severe effects on protein function due to altered reading frames.

  • Splicing allows for alternative mRNA products from a single gene, increasing protein diversity.
