Transcription and RNA Processing: From DNA to Protein
Study Guide - Smart Notes
Transcription and RNA Processing
Introduction to Transcription
Transcription is the process by which genetic information encoded in DNA is copied into RNA, serving as the first step in gene expression. In humans, the genome consists of approximately 3.1 billion base pairs and roughly 20,000 protein-coding genes, yet only about 1.5% of the DNA codes for proteins. Identifying genes within this vast genome is a key challenge in molecular biology.
Gene: A short stretch of DNA on a chromosome, typically composed of a regulatory region (upstream) and a coding region (downstream).
Stages of gene expression:
Transcription
Translation
What is a Gene?
A gene consists of two main parts: the regulatory region, which controls gene expression, and the coding region, which contains the information to make proteins. The regulatory region includes binding sites for RNA polymerase, which initiates transcription.
Regulatory region: Upstream DNA sequence that controls transcription initiation.
Coding region: Downstream DNA sequence that is transcribed into mRNA.
Transcription Overview
Transcription is similar to DNA replication but differs in several key aspects. Only one DNA strand (the template strand) is copied, and the product is RNA, not DNA. RNA polymerase is the enzyme responsible for synthesizing RNA.
Template strand: The DNA strand used to synthesize RNA.
RNA polymerase: Enzyme that synthesizes RNA from the DNA template.
Product: Messenger RNA (mRNA), which carries genetic information to ribosomes for protein synthesis.
Transcription Example
Given a DNA sequence, the corresponding RNA sequence is produced by complementary base pairing, replacing thymine (T) with uracil (U).
DNA: 3'-ATCGGCAGGACCTTAAAT-5'
RNA: 5'-UAGCCGUCCUGGAAUUUA-3'
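The base-pairing rule in this example can be sketched in a few lines of Python. This is only an illustration: the function name `transcribe` and the lookup table `DNA_TO_RNA` are made up for this sketch, and it assumes the template strand is written 3' to 5', as in the example above, so the RNA comes out 5' to 3' without reversing anything.

```python
# Complement rules for building RNA from a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

print(transcribe("ATCGGCAGGACCTTAAAT"))  # UAGCCGUCCUGGAAUUUA
```

If the template were written 5' to 3' instead, the result would need to be reversed to read the RNA in the conventional 5' to 3' direction.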
Genes and Disease: Garrod's Observations
In 1902, Archibald Garrod observed that certain diseases are hereditary and result from defective enzymes. This led to the hypothesis that genes carry the information for making enzymes. Maple Syrup Urine Disease is one example of such an inherited enzyme defect.
Maple Syrup Urine Disease: A genetic disorder caused by a defective enzyme; it is more prevalent in certain populations.
Beadle and Tatum (1941): One Gene-One Enzyme Hypothesis
Beadle and Tatum's experiments with Neurospora crassa demonstrated that each gene codes for a specific enzyme. Mutations in genes disrupt enzyme production, affecting metabolic pathways.
Wild type: Grows on minimal media by synthesizing all necessary molecules.
Mutant: Requires supplementation (e.g., leucine) due to defective enzyme.
Central Dogma of Molecular Biology
The central dogma describes the flow of genetic information: DNA is transcribed into RNA, which is then translated into protein. This process is fundamental to cell biology.
Replication: DNA is copied to produce more DNA.
Transcription: DNA is copied into RNA.
Translation: RNA is used to synthesize proteins.
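Equation: $\text{DNA} \xrightarrow{\text{transcription}} \text{RNA} \xrightarrow{\text{translation}} \text{Protein}$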
Translation: From Nucleotides to Amino Acids
Translation is the process by which the sequence of nucleotides in mRNA is converted into a sequence of amino acids, forming a protein. Each set of three nucleotides (codon) specifies one amino acid.
DNA/RNA bases: Adenine (A), Cytosine (C), Guanine (G), Thymine (T, DNA only), Uracil (U, RNA only)
Protein amino acids: Alanine, Arginine, Asparagine, Aspartate, Cysteine, Glutamine, Glutamate, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine
Codons and the Genetic Code
A codon is a unit of information in mRNA that encodes one amino acid. There are 20 amino acids and 4 nucleotide bases (A, U, G, C). Codons are three nucleotides long, allowing for 64 possible combinations. The code is degenerate, meaning multiple codons can specify the same amino acid.
Number of possible codons: $4^3 = 64$
Start codon: AUG (Methionine)
Stop codons: UAA, UAG, UGA
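The codon count can be checked by enumeration. The sketch below is illustrative only (the names `count_codons` and `sense_codons` are invented here). It also shows why a two-base code would not be enough: 16 combinations cannot cover 20 amino acids, while three bases give 64.

```python
from itertools import product

RNA_BASES = "UCAG"

def count_codons(length: int) -> int:
    """Number of distinct codons of the given length (equals 4 ** length)."""
    return sum(1 for _ in product(RNA_BASES, repeat=length))

for n in (1, 2, 3):
    print(n, count_codons(n))  # prints 4, 16, 64 for lengths 1, 2, 3

# Enumerate all triplets and set aside the three stop codons.
codons = ["".join(p) for p in product(RNA_BASES, repeat=3)]
stop_codons = {"UAA", "UAG", "UGA"}
sense_codons = [c for c in codons if c not in stop_codons]
print(len(sense_codons))  # 61
```

With 61 codons left for 20 amino acids, several codons must map to the same amino acid, which is what "degenerate" means here.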
Codon Table
| First Base | Second Base | Third Base | Amino Acid |
|---|---|---|---|
| U | U | U | Phe (F) |
| A | U | G | Met (M) - Start |
| U | A | A | Stop |
| U | A | G | Stop |
| U | G | A | Stop |
| G | G | G | Gly (G) |
| C | A | G | Gln (Q) |
| A | A | A | Lys (K) |
| G | A | A | Glu (E) |
Reading Frames
The reading frame determines how nucleotides are grouped into codons. There are three possible reading frames for any mRNA sequence, and the correct frame is set by the start codon. Frameshift mutations can alter the reading frame, resulting in abnormal proteins.
Reading frame-1: 5' CAC GGU CGA UGA GGU UAC AUA AC 3' (His Gly Arg STOP)
Reading frame-2: 5' C ACG GUC GAU GAG GUU ACA UAA C 3' (Thr Val Asp Glu Val Thr STOP)
Reading frame-3: 5' CA CGG UCG AUG AGG UUA CAU AAC 3' (Arg Ser Met Arg Leu His Asn; translation starting at the AUG gives Met Arg Leu His Asn)
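The three frames above can be reproduced with a short script. This is a minimal sketch, not a full translation tool: `CODON_TABLE` and `translate_frame` are names invented for the illustration, the table is the standard genetic code in one-letter amino acid symbols with `*` for stop, and every complete codon in the frame is translated without searching for a start codon or halting at a stop.

```python
from itertools import product

# Standard genetic code, listed in UCAG order for each codon position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_frame(mrna: str, offset: int) -> str:
    """Translate every complete codon starting at offset 0, 1, or 2."""
    codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
    return "".join(CODON_TABLE[c] for c in codons)

mrna = "CACGGUCGAUGAGGUUACAUAAC"
for offset in range(3):
    print(f"frame {offset + 1}: {translate_frame(mrna, offset)}")
# frame 1: HGR*GYI
# frame 2: TVDEVT*
# frame 3: RSMRLHN
```

In a cell, only the frame set by the start codon is used, and the ribosome stops at the first in-frame stop codon. The script prints all codons so the frames can be compared side by side.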
Crick and Brenner: Codon Structure
Crick and Brenner's experiments demonstrated that codons are read in a continuous sequence without spaces. Frameshift mutations, caused by nucleotide insertions or deletions, disrupt the reading frame and alter the resulting protein.
Frameshift mutation: Alters the reading frame and protein sequence.
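A small sketch can make the effect of a frameshift visible. The mRNA sequence and the helper names `codons` and `insert_base` are hypothetical, chosen only for illustration. The code groups bases into triplets before and after a single-base insertion.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets, starting at the first base."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def insert_base(seq: str, position: int, base: str) -> str:
    """Return seq with one base inserted before the given 0-based position."""
    return seq[:position] + base + seq[position:]

original = "AUGGCAUUUGGAUAA"            # hypothetical mRNA: AUG GCA UUU GGA UAA
mutant = insert_base(original, 4, "C")  # single-base insertion inside codon 2

print(codons(original))  # ['AUG', 'GCA', 'UUU', 'GGA', 'UAA']
print(codons(mutant))    # ['AUG', 'GCC', 'AUU', 'UGG', 'AUA', 'A']
```

Every triplet downstream of the insertion is regrouped, and the original UAA stop codon is no longer in frame. Inserting three bases instead of one would add a codon but leave the downstream grouping unchanged.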
Nirenberg (1961): Deciphering the Genetic Code
Nirenberg's experiments used synthetic mRNA to determine which amino acids are encoded by specific codons. For example, UUU codes for phenylalanine, and codons such as UAA, UAG, and UGA are stop codons that terminate translation.
Example: A synthetic poly-U mRNA (5'-UUUUUU...-3') encodes polyphenylalanine.
Stop codons: UAA, UAG, UGA (no amino acid is added; translation terminates)
Transcription Process in Prokaryotes
Prokaryotic transcription is carried out by RNA polymerase, which consists of a core enzyme (subunits α, α, β, β') and a sigma (σ) factor that directs the enzyme to promoter DNA. Unlike DNA polymerase, RNA polymerase does not require a primer to initiate synthesis.
Promoter: DNA sequence where RNA polymerase binds to initiate transcription; it contains the -35 and -10 elements upstream of the transcription start site (+1).
Elongation: RNA polymerase synthesizes RNA in the transcription bubble, adding nucleotides to the 3' end.
Termination: Transcription ends via mechanisms such as hairpin formation, causing RNA polymerase to dissociate from DNA.
Eukaryotic Differences in Transcription
Eukaryotes have three specialized RNA polymerases and more complex transcription initiation, involving multiple transcription factors. Transcription occurs in the nucleus, while translation occurs in the cytoplasm. Eukaryotic mRNA undergoes extensive processing before translation.
RNA polymerases: I, II, III (each transcribes different types of RNA)
Transcription factors: Proteins that assist RNA polymerase binding to the promoter.
Spatial separation: Transcription in nucleus, translation in cytoplasm.
Eukaryotic mRNA Structure and Processing
Eukaryotic mRNA is modified after transcription to increase stability and facilitate translation. Modifications include a 5' cap and a 3' poly-A tail.
5' cap: Addition of a methylated guanine nucleotide to the 5' end; prevents degradation.
3' poly-A tail: Addition of 100-200 adenine nucleotides to the 3' end; increases transcript stability.
Primary vs. Mature Transcripts
The primary transcript (pre-mRNA) contains both exons (coding regions) and introns (non-coding regions). Introns are removed by the spliceosome to produce mature mRNA.
Exons: Protein-coding sequences retained in mature mRNA.
Introns: Non-coding sequences removed during RNA processing.
mRNA Splicing
Splicing is carried out by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs) and proteins. The spliceosome recognizes specific sequences at the intron-exon boundaries and removes introns, joining exons together.
Spliceosome: snRNPs + protein complex that mediates splicing.
Recognition sites: GU at the 5' end of the intron, AG at its 3' end, and a branch point A within the intron.
Process:
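The spliceosome cuts at the 5' end of the intron, using the branch point A.
The freed 5' end of the intron bonds to the branch point A, forming a lariat structure.
The 3' end of exon 1 then attacks the 3' end of the intron; the exons are joined and the intron is excised as a lariat.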
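The outcome of splicing (introns out, exons joined) can be modeled as a toy string operation. This sketch makes several assumptions: intron positions are supplied by hand as 0-based half-open intervals, the sequences are hypothetical, and the function `splice` only checks the GU...AG boundaries. It does not model the branch point, the lariat, or how the spliceosome actually finds its sites.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) intervals and join the exons."""
    mature, cursor = [], 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Check the boundary dinucleotides: GU at the 5' end, AG at the 3' end.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        mature.append(pre_mrna[cursor:start])  # exon preceding this intron
        cursor = end
    mature.append(pre_mrna[cursor:])  # final exon
    return "".join(mature)

# Hypothetical pre-mRNA: exon 1 + intron + exon 2
exon1, intron, exon2 = "AUGGCA", "GUAAGUCUAACUUUCAG", "UUUGGAUAA"
pre = exon1 + intron + exon2
print(splice(pre, [(len(exon1), len(exon1) + len(intron))]))  # AUGGCAUUUGGAUAA
```

Passing different sets of intron intervals for the same pre-mRNA is a simple way to picture how one gene can yield more than one mature transcript.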
Example: Lactase (LCT) Mutation
Lactase persistence is an example of a mutation affecting gene regulation. Mutations in the regulatory region of the LCT gene allow continued expression of lactase, enabling adults to digest lactose. This trait spread in populations practicing cattle herding.
Lactase persistence: Mutation in regulatory region leads to continued lactase production in adults.
Evolutionary significance: Provided a selective advantage in populations with dairy-based diets.
Summary Table: Key Differences in Transcription
| Feature | Prokaryotes | Eukaryotes |
|---|---|---|
| RNA Polymerase | Single type | Three types (I, II, III) |
| Initiation | Sigma factor, simple promoter | Multiple transcription factors, complex promoter |
| Location | Cytoplasm | Nucleus |
| RNA Processing | Minimal | 5' cap, poly-A tail, splicing |
Additional info:
Frameshift mutations can have severe effects on protein function due to altered reading frames.
Splicing allows for alternative mRNA products from a single gene, increasing protein diversity.