Genome Evolution: Genomics, Bioinformatics, and Human Genetic Variation
Study Guide - Smart Notes
Genome Evolution
Genomics & Bioinformatics
Genomics and bioinformatics are fields that study the structure, function, and evolution of genomes using computational and sequencing technologies. These disciplines enable scientists to analyze entire genomes and extract meaningful biological insights.
Genomics: Study of entire genomes, including all genes, sequences, organization, and interactions.
Large-scale sequencing technologies have revolutionized genomics.
Applications:
Understanding gene function, gene expression, mutations, and epigenetics
Medical diagnostics and treatments
Agriculture and crop improvement
Environmental studies (e.g., metagenomics, pollutant monitoring)
Bioinformatics: Use of computational tools to store, analyze, and compare massive biological datasets.
Essential for analyzing genomes that contain billions of nucleotides.
Databases: GenBank & NCBI
GenBank (run by NCBI): Stores DNA/protein sequences; database size doubles every ~18 months.
Provides tools to:
Compare sequences
Examine predicted proteins
View 3D protein structures
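The ~18-month doubling time quoted for GenBank implies exponential growth. A minimal sketch of that arithmetic follows; the function name `projected_size` and the normalised starting size of 1 are illustrative assumptions, not NCBI figures.

```python
def projected_size(initial_size: float, months: float,
                   doubling_months: float = 18.0) -> float:
    """Size after `months` if the size doubles every `doubling_months`."""
    return initial_size * 2 ** (months / doubling_months)


# Relative growth over a few time spans (starting size normalised to 1).
for years in (1.5, 3, 6, 9):
    factor = projected_size(1.0, years * 12)
    print(f"after {years} years: about {factor:.0f}x the starting size")
```

At this rate the database grows roughly 16-fold in six years, which is why storage and search tools matter as much as the sequencing itself.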
Human Genome Project (HGP)
The Human Genome Project (1990–2003) was a landmark international effort to sequence the entire human genome. It involved several key stages:
Linkage Mapping (Genetic Mapping):
Determines the relative order of genetic markers.
Genetic marker: An identifiable DNA sequence (a gene or other variable site) whose inheritance can be tracked.
Constructed using recombination frequencies (markers that recombine more often are farther apart).
Produces a linkage map of gene positions on chromosomes.
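The linkage-mapping logic above can be sketched in a few lines: compute a recombination frequency for each marker pair, then treat the pair with the largest frequency as the two outer markers. The markers `A`, `B`, `C` and the offspring counts are hypothetical, and the helper names are assumptions made for illustration.

```python
def recombination_frequency(recombinants: int, total: int) -> float:
    """Percent of offspring that are recombinant for one marker pair."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * recombinants / total


def order_three_markers(rf: dict) -> tuple:
    """Order three markers given pairwise recombination frequencies.

    `rf` maps frozenset({m1, m2}) -> frequency in percent. The pair that
    recombines most often is farthest apart, so it forms the two ends.
    """
    outer = max(rf, key=rf.get)
    (middle,) = set().union(*rf) - outer
    left, right = sorted(outer)
    return left, middle, right


# Hypothetical cross data: (recombinant offspring, total offspring).
counts = {
    frozenset("AB"): (90, 1000),
    frozenset("BC"): (95, 1000),
    frozenset("AC"): (170, 1000),
}
rf = {pair: recombination_frequency(*c) for pair, c in counts.items()}
print(order_three_markers(rf))  # ('A', 'B', 'C')
```

Note that the two short intervals (9.0 + 9.5) add up to more than the measured A to C frequency (17.0). Double crossovers between distant markers go undetected, so large distances tend to be underestimated.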
Physical Mapping:
Determines the exact number of base pairs between markers.
DNA cut into fragments; fragments ordered by overlapping sequences.
Produces a physical map showing true distances in bp along DNA.
DNA Sequencing:
Determines the complete nucleotide order of the genome.
Uses fluorescent-tagged nucleotides for detection.
Human genome size:
3.2 billion bp (haploid)
6.4 billion bp (diploid)
Sequencing Approaches
Shotgun Sequencing: Skips the mapping steps. DNA is randomly fragmented, the fragments are sequenced, and software assembles them by aligning overlapping ends. The whole-genome shotgun approach was pioneered by J. Craig Venter.
Still widely used today.
Next-Generation Sequencing (NGS): Automated, rapid sequencing of hundreds of thousands to millions of short fragments per day. Faster, cheaper, and widely used since 2005.
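To make the "align overlaps using computers" step of shotgun sequencing concrete, here is a toy greedy assembler. It is a teaching sketch under strong assumptions (error-free reads, no repeats, a single strand), not the algorithm used by real assemblers; the function names and the example fragments are made up.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Hypothetical reads drawn from the sequence ATGCGTACGTTAGC.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Real genomes contain long repeats, which is exactly where a simple greedy merge like this one goes wrong and why repetitive DNA complicates assembly.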
Proteomics
Proteomics is the study of all proteins encoded by a genome. It focuses on protein interactions, function, and folding, and is important for drug development.
Recent breakthrough: AlphaFold, an AI system that predicts a protein's 3D structure from its amino acid sequence with high accuracy.
Genome Size & Composition
Genome Size
Prokaryotes: Typically 1–6 million base pairs, smaller than most eukaryotic genomes.
Eukaryotes: Generally much larger (often >100 million bp).
No consistent relationship between genome size and organism complexity.
Human Genome Composition
Approximate shares of the total genome:
~1.5% = exons (sequences coding for protein or giving rise to rRNA/tRNA)
~98.5% = noncoding DNA
~25% = introns + gene-associated regulatory sequences
~73% = other noncoding DNA, most of it repetitive
44% of total genome = transposable elements and related sequences
Many noncoding regions are highly conserved across species, suggesting functional importance.
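A percentage is easier to picture as a number of base pairs. The sketch below converts the 44% transposable-element share into base pairs using the 3.2-billion-bp haploid size given earlier in these notes; the helper name `bp_for_fraction` is an assumption for illustration.

```python
HAPLOID_GENOME_BP = 3_200_000_000  # haploid human genome size from these notes


def bp_for_fraction(percent: float, genome_bp: int = HAPLOID_GENOME_BP) -> int:
    """Number of base pairs that make up `percent` of the genome."""
    return round(genome_bp * percent / 100)


te_bp = bp_for_fraction(44)
print(f"Transposable elements: about {te_bp / 1e9:.2f} billion bp")
```

The same helper works for any other category once its share of the genome is known.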
Transposable Elements (Mobile Genetic Elements)
Transposable elements are DNA segments that move within the genome, also known as "jumping genes." First discovered by Barbara McClintock (Nobel Prize 1983).
Importance:
Drive genomic evolution by causing:
Insertions (disrupt gene function)
Duplications via recombination
Exon shuffling
Regulatory sequence movement
Types of Transposable Elements
| Type | Mechanism | Description |
|---|---|---|
| Transposons | DNA intermediate | "Cut-and-paste" or "copy-and-paste" movement |
| Retrotransposons | RNA intermediate | Transcribed to RNA, reverse transcribed to DNA, inserted elsewhere |
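The two movement styles in the table can be modelled on a string "genome". This is only a toy sketch of the outcome: it ignores the enzymes involved and the RNA intermediate of retrotransposons, and the function names and the example sequence (lowercase flanking DNA, uppercase `TE` element) are hypothetical.

```python
def cut_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Excise genome[start:end] and reinsert it at `target`.

    `target` is an index into the sequence that remains after excision.
    """
    element = genome[start:end]
    remainder = genome[:start] + genome[end:]
    return remainder[:target] + element + remainder[target:]


def copy_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Keep the original element and insert a copy at index `target`."""
    element = genome[start:end]
    return genome[:target] + element + genome[target:]


genome = "aaaaTEccccgggg"
print(cut_and_paste(genome, 4, 6, 8))    # aaaaccccTEgggg
print(copy_and_paste(genome, 4, 6, 10))  # aaaaTEccccTEgggg
```

Cut-and-paste leaves the genome the same length, while copy-and-paste adds one element's worth of sequence per jump. Retrotransposons always leave the original copy in place, which helps explain how such elements came to make up a large share of the genome.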
Genomic Comparisons & Evolutionary Trees
Comparing genomes reveals recent and ancient evolutionary events. Evolutionary trees visualize relationships among species or genes.
Example: Human and chimpanzee genomes differ by only about 1.2% in single-nucleotide substitutions.
Some genes (immune system, brain size, transcription factors) show faster evolution in humans.
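A figure such as "1.2% difference" comes from comparing aligned sequences position by position. A minimal sketch, assuming the two sequences are already aligned to equal length with `-` marking gaps; the function name and the short example sequences are made up and are not real human or chimpanzee data.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, gap-free positions where the bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps; only substitutions are counted
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * mismatches / compared


species_1 = "ATGCCGTAAGCTTAGGCTAA"
species_2 = "ATGCCGTAAGCTTAGGCTGA"  # one substitution in 20 bases
print(percent_difference(species_1, species_2))  # 5.0
```

Because gaps are skipped, this measure counts substitutions only. Insertions and deletions would have to be tallied separately.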
Human Genetic Variation
Any two humans differ at only about 0.1% of nucleotide positions (roughly 3 million sites in a 3.2-billion-bp haploid genome). Variation arises from several sources:
SNPs (Single Nucleotide Polymorphisms)
Inversions
Duplications/deletions
Short tandem repeats
Transposable elements
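Two of the sources listed above, SNPs and short tandem repeats, are easy to detect in code. The sketch below assumes the sequences being compared are already aligned and of equal length; the function names and example sequences are hypothetical.

```python
import re


def find_snps(ref: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, ref base, sample base) for each mismatch."""
    if len(ref) != len(sample):
        raise ValueError("sequences must have the same length")
    return [(pos, r, s)
            for pos, (r, s) in enumerate(zip(ref, sample), start=1)
            if r != s]


def longest_repeat_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# A single-base difference between two aligned sequences.
print(find_snps("ACGTACGTAC", "ACGTTCGTAC"))  # [(5, 'A', 'T')]

# Two alleles of a made-up STR locus that differ in repeat count.
allele_1 = "TT" + "GATA" * 5 + "CC"
allele_2 = "TT" + "GATA" * 8 + "CC"
print(longest_repeat_run(allele_1, "GATA"), longest_repeat_run(allele_2, "GATA"))  # 5 8
```

Differences in repeat count at STR loci, like the 5 versus 8 copies here, are the kind of variation used in DNA profiling.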