The Interrupted Gene: Structure, Function, and Variation in Eukaryotic Genomes
Study Guide - Smart Notes
Lecture 4: The Interrupted Gene
Defining the Anatomy of a Gene
Genes are fundamental units of heredity, and their structure determines how genetic information is expressed. The anatomy of a gene includes several distinct regions, each with specific functions in transcription and regulation.
Promoter/Enhancer Sequences: Regions where transcription factors and RNA polymerase bind to initiate transcription. The TATA box is a common promoter element.
5' UTR (Untranslated Region): Sequence between the transcription start site and the start codon; it is transcribed but not translated, and it contains signals for translation initiation, such as ribosome binding.
Gene Sequence: The region between the start and stop codons; encodes the protein product. Start and stop codons define the boundaries for translation.
3' UTR: Sequence downstream of the stop codon; it is transcribed but not translated, and it contains the signals for polyadenylation, which is coupled to transcription termination.
Mutations in Non-Coding Parts of Genes
Mutations can occur in both coding and non-coding regions of genes, affecting gene expression and protein production.
Regulatory Site Mutations: Changes in control sites or promoters can dramatically alter gene expression.
LOF (Loss of Function): The mutation reduces or abolishes production of functional protein.
GOF (Gain of Function): The mutation increases expression, leading to excess protein or to expression at the wrong time or place.
Example: Mutations in the promoter region may prevent RNA polymerase binding, silencing the gene.
The Structure of the Gene: Prokaryotic vs Eukaryotic Genes
Gene organization differs significantly between prokaryotes and eukaryotes, influencing transcription and gene regulation.
Prokaryotic Genes:
Bacterial mRNAs and genes are usually colinear (no introns).
RNA polymerase reads the template strand from 3' to 5'.
mRNA is synthesized 5' to 3', substituting uracil (U) for thymine (T).
Eukaryotic Genes:
Often interrupted by introns (non-coding regions).
Require splicing to produce mature mRNA.
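The template-strand rule in the prokaryotic bullets above can be sketched in a few lines of Python. This is a minimal illustration, not a model of RNA polymerase: the function name `transcribe` and the nine-base template are made up for the example, and the template is assumed to be written 3' to 5' so that the resulting mRNA reads 5' to 3'.

```python
# Base-pairing rules from a DNA template base to the RNA base added opposite it.
# Uracil (U) takes the place of thymine (T) in RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (written 5'->3') for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical template strand, written 3' -> 5'.
template = "TACGGCATT"
mrna = transcribe(template)
print(mrna)  # AUGCCGUAA
```

Because the gene is colinear with its mRNA in bacteria, this toy transcript already reads as a start codon, one sense codon, and a stop codon with no further processing.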
Interrupted Genes and Introns in Eukaryotes
Most eukaryotic genes are interrupted by introns, which must be removed during RNA processing.
Interrupted Gene: A gene with non-continuous coding sequence due to introns.
Primary (RNA) Transcript: The initial unmodified RNA product containing 5' UTR, exons, introns, and 3' UTR.
RNA Splicing: The process of excising introns and joining exons to form mature mRNA.
Mature Transcript: Modified RNA with introns removed and alterations at the 5' and 3' ends.
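As a rough sketch of the splicing step described above, the following Python removes introns from a toy primary transcript and joins the remaining exons in their original order. The function name `splice`, the sequence, and the intron coordinates are all hypothetical; real splicing is directed by splice-site sequences and the spliceosome, not by supplied coordinates, and the 5' cap and poly(A) tail are ignored here.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns and join exons in their original order.

    Each intron is a (start, end) pair: 0-based, end-exclusive,
    and assumed not to overlap any other intron.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # resume after the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Toy primary transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuag" + "GAU" + "guccacag" + "UAA"
introns = [(6, 16), (19, 27)]
print(splice(pre_mrna, introns))  # AUGGCCGAUUAA
```

Sorting the introns before cutting keeps the exons in their genomic order, matching the rule that exons are joined without rearrangement.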
Intron/Exons and the Nature of the Gene
Exons and introns are fundamental to gene structure and function in eukaryotes.
Exons: Coding stretches of DNA that remain in mature mRNA.
Introns: Non-coding stretches that are spliced out during RNA processing.
Exons are not jumbled during splicing; their order is maintained.
Exon splicing is allele-specific: splicing joins exons within a single RNA molecule, so exons transcribed from one allele are not joined to exons from another.
Organization and Conservation of Interrupted Genes
The arrangement of exons and introns can be conserved across species, though intron length may vary.
Positions of introns are usually conserved in homologous genes between organisms.
Intron lengths can vary greatly, but introns rarely encode proteins.
Exon Sequences Under Negative Selection
Exon sequences are often conserved due to selective pressure, while introns evolve more rapidly.
Comparative studies show exons are conserved, but intron sequences differ significantly between species.
Negative selection maintains exon sequence to ensure functional protein production.
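One simple way to see this pattern is to compare percent identity for aligned exon and intron sequences from two species. The sketch below uses invented 12-base sequences purely for illustration (they are not real data), assumes the sequences are already aligned without gaps, and uses a hypothetical helper named `percent_identity`.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions that match between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Invented toy sequences, not real data.
exon_species_1 = "ATGGCCGATTTC"
exon_species_2 = "ATGGCTGATTTC"
intron_species_1 = "GTAAGTCCTTAG"
intron_species_2 = "GTGCATAGCAAG"

exon_identity = percent_identity(exon_species_1, exon_species_2)
intron_identity = percent_identity(intron_species_1, intron_species_2)
print(round(exon_identity, 1), round(intron_identity, 1))  # 91.7 41.7
```

In this toy case the exon pair differs at one position while the intron pair differs at most positions, which is the kind of contrast that comparative studies report.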
Gene Length Variation Across Organisms
Gene length and exon/intron structure vary among organisms, reflecting evolutionary and functional diversity.
Yeast has few interrupted genes; mammals have many, with variable exon numbers per gene.
Most exons code for 30-60 amino acids.
Gene length is primarily determined by intron size and number.
Distribution of Gene Sizes
Genes show a wide range of sizes, mainly due to variation in intron size and number.
Exons are usually short, encoding fewer than 100 amino acids.
Intron sizes vary widely depending on the organism.
Overall gene length is determined mostly by intron size.
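A small worked example makes the point about intron size concrete. The exon and intron lengths below are hypothetical values chosen for illustration, UTRs are ignored, and `gene_length_summary` is a name introduced only for this sketch.

```python
def gene_length_summary(exon_lengths: list[int], intron_lengths: list[int]) -> dict:
    """Summarize how much of a gene's length comes from exons versus introns."""
    exonic = sum(exon_lengths)
    intronic = sum(intron_lengths)
    total = exonic + intronic
    return {
        "exonic_nt": exonic,
        "intronic_nt": intronic,
        "gene_length": total,
        "intron_fraction": intronic / total,
    }


# Hypothetical gene: four short exons separated by three introns (lengths in nucleotides).
summary = gene_length_summary([120, 150, 90, 180], [2500, 800, 12000])
print(summary["gene_length"])               # 15840
print(f"{summary['intron_fraction']:.1%}")  # 96.6%
```

Here 540 nucleotides of exon sit inside a gene of 15,840 nucleotides, so nearly all of the length comes from the introns.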
A Gene is a Transcription Unit
A gene contains all the information required for its expression, including coding and regulatory regions.
A single gene can produce more than one protein when alternative control elements are used.
Different transcription start sites and splice junctions can be used.
Sometimes the open reading frame (ORF) of one protein lies entirely within that of another, so the two proteins share sequence and have related functions.
Open Reading Frames (ORFs)
An open reading frame (ORF) is a sequence of DNA or RNA that can be translated into a protein, beginning with a start codon and ending with a stop codon.
ORFs are found in both mRNA and pre-mRNA.
Each mRNA has three possible reading frames, depending on where translation starts.
Start codon (AUG) and stop codons (UAA, UAG, UGA) define the boundaries of an ORF.
Incorrect reading frames can result in nonfunctional or truncated proteins.
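The idea of three reading frames can be illustrated with a short scan for ORFs. This is a minimal sketch: `find_orfs` is a hypothetical helper, the mRNA string is invented, only the three forward frames of a single strand are checked, and an ORF is taken to run from the first AUG in a frame to the next in-frame stop codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna: str) -> list[tuple[int, int, str]]:
    """Return (frame, start_index, sequence) for each AUG...stop ORF in the three frames."""
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # open an ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                orfs.append((frame, start, mrna[start:i + 3]))
                start = None  # close the ORF at the in-frame stop codon
    return orfs


# Invented toy mRNA; the only ORF is in the third frame (offset 2).
mrna = "GGAUGGCCUUUUAAGC"
print(find_orfs(mrna))  # [(2, 2, 'AUGGCCUUUUAA')]
```

Reading the same sequence from offset 0 or 1 gives a different series of codons with no AUG, which is why the choice of frame matters.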
Alternative Splicing and Protein Diversity
Alternative splicing allows a single gene to produce multiple protein isoforms, increasing functional diversity.
Related proteins are produced if the reading frame is maintained.
Unrelated proteins can result if splicing alters the reading frame.
Proteins often share domains due to alternative splicing.
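Whether an alternative splice keeps the reading frame often comes down to simple arithmetic: skipping an internal coding exon preserves the downstream frame only if that exon's length is a multiple of three. The sketch below assumes exon skipping as the splicing event and uses hypothetical exon lengths; `skipping_preserves_frame` is a name made up for the example.

```python
def skipping_preserves_frame(exon_length_nt: int) -> bool:
    """True if removing an exon of this length leaves downstream codons in the same frame."""
    return exon_length_nt % 3 == 0


# Hypothetical internal coding exons (lengths in nucleotides).
for length in (96, 121, 150):
    outcome = "frame kept" if skipping_preserves_frame(length) else "frame shifted"
    print(length, outcome)
# 96 frame kept
# 121 frame shifted
# 150 frame kept
```

Skipping the 96 or 150 nucleotide exon would give a related isoform that lacks one segment, while skipping the 121 nucleotide exon would change every downstream codon.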
Exons and Protein Domains
Exons often correspond to protein domains, which are structural and functional units of a protein that fold independently.
Typical domain: 30-60 amino acids, encoded by 90-180 nucleotides.
DNA for exons can be moved by transposable elements, explaining the modular nature of proteins.
Insertion of exons into introns can produce proteins with new domains.
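The nucleotide range quoted for a typical domain follows from the triplet code, with each amino acid specified by one three-nucleotide codon. Written out, using $n_{\text{aa}}$ and $n_{\text{nt}}$ as symbols introduced here for the number of amino acids and nucleotides:

$$n_{\text{nt}} = 3 \times n_{\text{aa}}, \qquad 3 \times 30 = 90, \qquad 3 \times 60 = 180$$

So a domain of 30 to 60 amino acids needs 90 to 180 nucleotides of coding sequence.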
Forms of Information in DNA
DNA encodes various types of genetic information beyond conventional phenotypes.
Includes information for genome structure, regulatory elements, and positional cues in development.
Horizontal gene transfer can introduce new sequences into introns or intergenic regions.
Some transferred sequences may be involved in intracellular recognition.
Breakout Questions
Define ORF: An open reading frame is a stretch of nucleotides that can be translated into a protein, starting with a start codon and ending with a stop codon.
Location of ORF: Found in both mRNA and pre-mRNA.
Number of Reading Frames: Three possible reading frames per mRNA strand.
Start/Stop Codons: Start codon is AUG; stop codons are UAA, UAG, UGA.
Introns and Exons: Exons are coding regions that remain in mature mRNA; introns are non-coding regions that are spliced out. Organisms differ in how many of their genes are interrupted: yeast has few interrupted genes, mammals have many.
Transcription Process:
Transcription begins at the promoter.
Primary transcript includes exons, introns, UTRs.
Splicing removes introns, joining exons.
Final mRNA contains exons and UTRs, ready for translation.
Summary Table: Gene Structure Elements
| Region | Function | Found in |
|---|---|---|
| Promoter/Enhancer | Transcription initiation | DNA |
| 5' UTR | Translation initiation signals (not translated) | Pre-mRNA, mRNA |
| Exon | Coding sequence | Pre-mRNA, mRNA |
| Intron | Non-coding, spliced out | Pre-mRNA |
| 3' UTR | Polyadenylation and termination signals (not translated) | Pre-mRNA, mRNA |
Key Equations and Concepts
Number of Reading Frames: $3$ (for a single-stranded mRNA)
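Central Dogma: $\text{DNA} \rightarrow \text{RNA} \rightarrow \text{Protein}$
Start Codon: $\text{AUG}$
Stop Codons: $\text{UAA},\ \text{UAG},\ \text{UGA}$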
Additional info: The notes expand on the modular nature of genes and proteins, the evolutionary conservation of exons, and the functional implications of alternative splicing and gene structure diversity across organisms.