Genome Structure, Mapping, and Genetic Polymorphism: Study Notes for Genetics
Study Guide - Smart Notes
Open Reading Frames, Introns, Exons, and Transcription
Open Reading Frames (ORFs)
An Open Reading Frame (ORF) is a continuous stretch of nucleotides in DNA or RNA that has the potential to code for a protein. It begins with a start codon (usually AUG) and ends with a stop codon (UAA, UAG, or UGA).
Location: ORFs are found in mature mRNA. They can also be predicted in pre-mRNA or genomic DNA, although in eukaryotes introns may interrupt the frame before splicing.
Reading Frames: Each mRNA has three possible reading frames, determined by the position at which translation begins. Double-stranded DNA has six (three per strand).
Identification: The start of an ORF is marked by a start codon, and the end by a stop codon.
Consequences of Wrong Reading Frame: Translation in the wrong frame can result in nonfunctional or truncated proteins due to premature stop codons.
Example: In the coding region of the lacZ gene, the correct reading frame produces β-galactosidase, while a frameshift mutation can abolish enzyme activity.
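The ORF definition above can be sketched in code. This is a minimal illustration, not a gene-prediction tool: it scans the three reading frames of a single mRNA strand and reports each stretch that runs from the first AUG to the next in-frame stop codon. The function name `find_orfs` and the toy sequence are assumptions made for this example.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"

def find_orfs(mrna: str) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for each ORF on one strand.

    Positions are 0-based; `end` is exclusive and includes the stop codon.
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((frame, start, i + 3))
                start = None
    return orfs

toy_mrna = "CCAUGGCUUGGUAAGC"  # hypothetical sequence
for frame, start, end in find_orfs(toy_mrna):
    print(frame, toy_mrna[start:end])  # 2 AUGGCUUGGUAA
```

Only frame 2 contains a start codon followed by an in-frame stop, which is why reading the same sequence in frame 0 or 1 yields no ORF here.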
Introns and Exons
Exons are coding regions of a gene that remain in the final mRNA after splicing, while introns are non-coding regions that are removed during RNA processing.
Variation Among Organisms: Not all organisms have the same number of introns and exons. Prokaryotes typically lack introns, while eukaryotes (especially higher organisms) have many introns.
Reason for Variation: The complexity of gene regulation and alternative splicing in eukaryotes leads to more introns and exons.
Example: The human β-globin gene contains two introns and three exons, while a typical bacterial protein-coding gene is uninterrupted.
Transcription and RNA Processing
Transcription is the process by which RNA is synthesized from a DNA template. The initial product is pre-mRNA, which contains both exons and introns.
Steps:
Initiation: RNA polymerase binds to the promoter region.
Elongation: RNA polymerase synthesizes RNA in the 5' to 3' direction.
Termination: Transcription ends at a terminator sequence.
Splicing: Introns are removed, and exons are joined to form mature mRNA.
Genetic Elements in Final mRNA: Only exons remain, including the 5' and 3' untranslated regions (UTRs), which are parts of exons; introns are excised.
Example: In eukaryotic cells, the pre-mRNA of the insulin gene undergoes splicing to produce mature mRNA that is translated into insulin protein.
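The transcription and splicing steps above can be illustrated with a toy model. This sketch assumes the exon coordinates are already known. The gene sequence, coordinates, and function names are hypothetical, and real splicing depends on splice-site recognition that is not modeled here.

```python
def transcribe(coding_strand: str) -> str:
    """Pre-mRNA matches the coding (non-template) strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive) in order; the gaps are introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Toy gene: exon 1 + intron (starts GT, ends AG) + exon 2.
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GAATAA"
toy_exons = [(0, 6), (18, 24)]

pre_mrna = transcribe(toy_gene)
mature_mrna = splice(pre_mrna, toy_exons)
print(pre_mrna)     # AUGGCCGUAAGUUUUCAGGAAUAA
print(mature_mrna)  # AUGGCCGAAUAA
```

The mature mRNA is shorter than the pre-mRNA by exactly the intron length, and the ORF (AUG ... UAA) is only continuous after splicing.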
Levels of Gene Expression and Genome Content
Levels of Gene Expression
Genome: The complete DNA sequence of an organism, including all genes and non-coding regions.
Exome: All the exons (coding regions) of a genome.
Transcriptome: The set of all expressed genes, defined by RNA molecules present in a cell or tissue at a specific time.
Proteome: The complete set of proteins produced in a cell or organism.
Interactome: The catalog of all protein-protein interactions.
Epigenome: All the chemical modifications (tags) on chromatin that regulate gene expression.
Example: The human genome contains about 20,000 protein-coding genes, but the proteome is much larger due to alternative splicing and post-translational modifications.
Genome Mapping and Variation
Types of Genome Maps
Linkage Maps: Show genetic distance between loci based on recombination frequencies, not physical distance.
Restriction Maps: Provide physical landmarks on DNA, measured in kilobases, using restriction enzymes (e.g., BamHI, SalI).
Sequencing Maps: Constructed by sequencing and assembling DNA fragments; provide ultimate resolution for gene identification.
Example: The first genetic map for Drosophila melanogaster was based on recombination frequencies; modern sequencing maps allow direct identification of genes.
Linkage Maps and Recombination Frequency
Linkage Maps: Developed by Thomas Hunt Morgan and Alfred Sturtevant; show how often genes are inherited together due to crossing over during meiosis.
Recombination Frequency (RF): The farther apart two genes are, the higher the RF; however, RF is not directly proportional to physical distance.
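Equation:
$$\mathrm{RF} = \frac{\text{number of recombinant offspring}}{\text{total number of offspring}} \times 100\%$$
Example: If RF(A-B) = 6.4% and RF(B-C) = 13.2%, the observed RF(A-C) = 18.5% is lower than the sum of 19.6%, because double crossovers between A and C restore the parental combination and go uncounted (RF is not simply additive).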
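The arithmetic behind the example can be checked with a short sketch. RF is computed here as recombinant offspring divided by total offspring, expressed as a percentage. The offspring counts are hypothetical, while the three RF values are the ones quoted in the example.

```python
def recombination_frequency(recombinants: int, total: int) -> float:
    """RF as a percentage: recombinant offspring divided by all offspring scored."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * recombinants / total

# Hypothetical cross: 64 recombinant offspring out of 1000 scored.
rf_example = recombination_frequency(64, 1000)

# Figures quoted in the example above.
rf_ab, rf_bc, rf_ac_observed = 6.4, 13.2, 18.5
rf_ac_additive = rf_ab + rf_bc
shortfall = rf_ac_additive - rf_ac_observed

print(f"additive: {rf_ac_additive:.1f}%  observed: {rf_ac_observed:.1f}%  "
      f"shortfall: {shortfall:.1f}%")
```

The shortfall of about 1.1 percentage points is the part of the A-C distance that a two-point cross cannot see, which is why map distances are usually built up from short intervals.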
Restriction Maps
Restriction Enzymes: Endonucleases that recognize specific DNA sequences (e.g., BamHI: GGATCC).
Physical Mapping: DNA fragments generated by restriction digestion are measured in kilobases.
Example: A restriction map of a plasmid shows the locations of BamHI and SalI sites, allowing estimation of fragment sizes.
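A restriction digest can be simulated on a sequence to show where fragment sizes come from. The sketch below assumes the standard recognition sites for BamHI (G^GATCC) and SalI (G^TCGAC), both cutting after the first base on the top strand. The toy sequence and function names are hypothetical, and a site that spans the origin of a circular molecule is not detected.

```python
# Recognition sequence and cut offset (bases from the start of the site).
SITES = {"BamHI": ("GGATCC", 1), "SalI": ("GTCGAC", 1)}

def cut_positions(dna: str, enzymes: list[str]) -> list[int]:
    """Sorted 0-based positions at which the top strand is cut."""
    dna = dna.upper()
    cuts = set()
    for name in enzymes:
        site, offset = SITES[name]
        pos = dna.find(site)
        while pos != -1:
            cuts.add(pos + offset)
            pos = dna.find(site, pos + 1)
    return sorted(cuts)

def fragment_lengths(dna: str, enzymes: list[str], circular: bool = False) -> list[int]:
    """Fragment sizes in bp after a complete digest."""
    cuts = cut_positions(dna, enzymes)
    if not cuts:
        return [len(dna)]
    if circular:
        # n cuts on a circle give n fragments; the last one wraps past the origin.
        lengths = [b - a for a, b in zip(cuts, cuts[1:])]
        lengths.append(len(dna) - cuts[-1] + cuts[0])
        return lengths
    bounds = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

toy_dna = "AAAA" + "GGATCC" + "TTTTTTTT" + "GTCGAC" + "CC"  # 26 bp, hypothetical
print(fragment_lengths(toy_dna, ["BamHI", "SalI"]))                 # [5, 14, 7]
print(fragment_lengths(toy_dna, ["BamHI", "SalI"], circular=True))  # [14, 12]
```

Comparing single and double digests in this way is how the relative order of sites on a map is worked out: the fragment sizes must add up to the same total in every digest.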
Sequencing Maps
Genome Sequencing: Involves fragmenting DNA, sequencing each piece, and assembling the full sequence.
Gene Identification: Based on known gene structure and sequence features; experimental verification is required.
Example: The Human Genome Project produced a reference sequence that was used to identify most human protein-coding genes.
Genetic Polymorphism
Definition and Allele Frequency
Genetic Polymorphism: The existence of two or more alleles at a locus, where even the least common allele has a frequency >1% in the population.
Allele Frequency: Calculated as the proportion of a specific allele among all alleles at a locus in a population.
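Equation (Hardy-Weinberg):
$$p + q = 1, \qquad p^2 + 2pq + q^2 = 1$$
where $p$ and $q$ are the frequencies of the two alleles.
Example: In a population of 10 people with 20 chromosomes, if 6 are "a" and 14 are "A", then the frequency of A is $p = 14/20 = 0.7$ and the frequency of a is $q = 6/20 = 0.3$.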
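The allele-frequency calculation in the example can be written out as a short sketch. It also computes the standard Hardy-Weinberg expected genotype proportions (p², 2pq, q²), which assume random mating and no selection, migration, or mutation. The function names are chosen for illustration.

```python
def allele_frequencies(count_A: int, count_a: int) -> tuple[float, float]:
    """Frequency of each allele among all chromosomes sampled at the locus."""
    total = count_A + count_a
    if total == 0:
        raise ValueError("no alleles counted")
    return count_A / total, count_a / total

def hardy_weinberg(p: float, q: float) -> dict[str, float]:
    """Expected genotype proportions under Hardy-Weinberg equilibrium."""
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

# Counts from the example above: 14 "A" and 6 "a" among 20 chromosomes.
p, q = allele_frequencies(14, 6)
expected = hardy_weinberg(p, q)
print(f"p = {p:.2f}, q = {q:.2f}")
print({genotype: round(freq, 2) for genotype, freq in expected.items()})
# {'AA': 0.49, 'Aa': 0.42, 'aa': 0.09}
```

With only 10 individuals the observed genotype counts would rarely match these proportions exactly; the expected values are a baseline for comparison, not a prediction for small samples.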
Neutral Polymorphisms vs Deleterious Mutations
Neutral Polymorphisms: Sequence changes that do not affect protein function or organism fitness (e.g., silent mutations, changes in unconserved regions).
Deleterious Mutations: Cause abnormal proteins and disease; usually have low allele frequency (<1%).
Example: Human eye color is a neutral polymorphism; sickle cell anemia is caused by a deleterious mutation.
Distribution of Polymorphisms
Location: Most polymorphisms are found in non-coding regions, such as repetitive DNA, rather than exons.
Reason: Non-coding regions tolerate more variation without affecting organism fitness.
Repetitive DNA and Transposons
Types of Repetitive DNA
Highly Repetitive DNA: Short sequences (<100 bp) repeated thousands of times (e.g., satellite DNA, minisatellites, microsatellites).
Moderately Repetitive DNA: Sequences repeated 10-1000 times; includes transposons and rRNA/tRNA genes.
Example: Short Tandem Repeats (STRs) are used in forensic analysis due to their high variability.
Transposons
Transposons: Mobile genetic elements that can move within the genome and create additional copies of themselves.
Impact: Can promote recombination and chromosomal rearrangements; major source of repetitive DNA.
Example: LINE and SINE elements in the human genome are transposons.
Types and Detection of Polymorphism
Types of Polymorphism
Single Nucleotide Polymorphisms (SNPs): Variation at a single nucleotide position.
Small-scale Insertions/Deletions: Addition or loss of a few nucleotides.
Short Tandem Repeats (STRs): 1-6 nucleotides repeated 5-50 times.
Transposons: Account for about 45% of the human genome.
Detection of Polymorphisms
Sequencing: Direct determination of DNA sequence.
Non-sequencing Approaches: Use restriction enzyme digestion or PCR to detect length differences in DNA fragments (e.g., STR analysis).
Example: Agarose gel electrophoresis separates DNA fragments by size after restriction digestion.
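Length-based detection works because STR alleles differ in repeat number, so a PCR product spanning the repeat differs in size between alleles. The sketch below counts tandem copies of a motif and compares two amplicon lengths. The flanking sequences, the GATA motif, and the repeat counts are hypothetical values chosen for illustration.

```python
import re

def longest_repeat_run(sequence: str, motif: str) -> int:
    """Largest number of back-to-back copies of `motif` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)

# Hypothetical primer-to-primer amplicons for two alleles of one STR locus.
FLANK_5 = "ACGTTGCA"
FLANK_3 = "TTGACCAG"
allele_short = FLANK_5 + "GATA" * 7 + FLANK_3
allele_long = FLANK_5 + "GATA" * 11 + FLANK_3

for name, amplicon in [("short", allele_short), ("long", allele_long)]:
    repeats = longest_repeat_run(amplicon, "GATA")
    print(f"{name}: {repeats} repeats, {len(amplicon)} bp")
# short: 7 repeats, 44 bp
# long: 11 repeats, 60 bp
```

The 16 bp size difference corresponds to four extra copies of the 4 bp motif, which is the kind of difference that size separation resolves without sequencing.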
Applications of Polymorphisms
Parentage and Forensics
Parentage Analysis: STR profiles are used to match children to parents.
Forensics: DNA profiles from crime scenes are matched to suspects using polymorphism data.
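The logic of STR-based parentage testing can be sketched as a consistency check: at every locus, a child should carry one allele from each parent. The locus names and repeat counts below are hypothetical, and the check ignores mutation and genotyping error, which real analyses account for statistically.

```python
Profile = dict[str, tuple[int, int]]  # locus name -> the two repeat counts

def consistent_with_parents(child: Profile, mother: Profile, father: Profile) -> bool:
    """True if, at every locus, the child's alleles split as one from each parent."""
    for locus, (c1, c2) in child.items():
        maternal = set(mother[locus])
        paternal = set(father[locus])
        if not ((c1 in maternal and c2 in paternal) or
                (c2 in maternal and c1 in paternal)):
            return False
    return True

# Hypothetical two-locus profiles.
child = {"STR_1": (12, 14), "STR_2": (7, 9)}
mother = {"STR_1": (12, 13), "STR_2": (9, 9)}
candidate_a = {"STR_1": (14, 15), "STR_2": (7, 8)}
candidate_b = {"STR_1": (11, 15), "STR_2": (7, 8)}

print(consistent_with_parents(child, mother, candidate_a))  # True
print(consistent_with_parents(child, mother, candidate_b))  # False
```

A single inconsistent locus excludes candidate B in this simplified model; a consistent result such as candidate A's only fails to exclude, and its weight grows with the number of loci tested.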
Mapping Disease-Associated Mutations
Genome-Wide Association Studies (GWAS): SNPs are screened across genomes of patients and controls to identify disease-associated variants.
Example: SNPs associated with heart disease or schizophrenia are identified by comparing patient and nonpatient DNA.
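The patient-versus-control comparison can be illustrated with a per-SNP test of allele counts. The notes do not name a statistic, so the Pearson chi-square test on a 2x2 table used here is an assumption, as are the SNP names and counts. Real studies test very many SNPs and therefore apply much stricter significance thresholds than a single test would need.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 d.f., no continuity correction) for [[a, b], [c, d]].

    Rows: patients, controls. Columns: risk allele, other allele.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom

# Hypothetical allele counts: (patient risk, patient other, control risk, control other).
snp_counts = {
    "snp_1": (60, 40, 40, 60),
    "snp_2": (50, 50, 50, 50),
}

scores = {snp: chi_square_2x2(*counts) for snp, counts in snp_counts.items()}
for snp, score in scores.items():
    print(snp, round(score, 2))
# snp_1 8.0
# snp_2 0.0
```

Here `snp_1` shows a difference in allele frequency between the two groups, while `snp_2` shows none; an association flagged this way marks a region of interest rather than proving a causal variant.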
Genome Size and Organization
Genome Size Variation
Definition: Total amount of DNA in the haploid genome.
Variation: Genome sizes range from about 10^6 bp in bacteria to 3.3 × 10^9 bp in humans.
| Group | Species | Genome size (bp) |
|---|---|---|
| Algae | Pyramonas salina | 6.6 × 10^6 |
| Bacterium | E. coli | 4.2 × 10^6 |
| Yeast | S. cerevisiae | 1.3 × 10^7 |
| Nematode | C. elegans | 9.7 × 10^7 |
| Insect | D. melanogaster | 1.4 × 10^8 |
| Mammal | H. sapiens | 3.3 × 10^9 |
Repetitive DNA and Genome Complexity
Repetitive DNA: Larger genomes contain more repetitive DNA, not necessarily more genes.
Transposons: Major source of repetitive DNA in eukaryotic genomes.
Organelle Genomes and Endosymbiosis
Non-nuclear DNA and Inheritance
Non-Mendelian Inheritance: Traits controlled by organelle genomes (mitochondria, chloroplasts) are typically inherited maternally, because the egg contributes most of the zygote's cytoplasm.
Organelle DNA: Evolves at a different rate than nuclear DNA; usually no recombination between organelle genomes.
Example: Mitochondrial DNA analysis indicates that the mtDNA of all living humans traces back to a single woman who lived in Africa ~200,000 years ago.
Mitochondrial Genomes (mtDNA)
Structure: Small, circular molecules resembling bacterial genomes; several copies per organelle.
Variation: Number of protein-coding and RNA-coding genes varies by species.
| Group | Size (kb) | Protein-Coding Genes | RNA-Coding Genes |
|---|---|---|---|
| Fungi | 12-100 | 8-14 | 20-30 |
| Plants | 60-250 | 25-42 | 20-30 |
| Animals | 16-17 | 13 | 24 |
Genes in Mitochondria
Respiratory Complexes: mtDNA encodes components of the electron transport chain.
Protein Synthesis Machinery: mtDNA encodes rRNAs and some tRNAs.
Import of Proteins: Many mitochondrial proteins are encoded by nuclear DNA and imported into the organelle.
Endosymbiosis and Organelle Evolution
Endosymbiosis Theory: Mitochondria and chloroplasts originated from free-living bacteria engulfed by ancestral eukaryotic cells.
Genome Reduction: Organelle genomes have lost many genes not necessary for independent life.
Protein Transfer: Proteins encoded by nuclear genes require special targeting sequences for import into organelles.
Example: Sequence comparisons between mtDNA and bacterial DNA support a common evolutionary origin.