Gene Evolution and Phylogenetic Inference in Biological Evolution

Introduction

This study guide explores how the history of genes informs our understanding of evolution, focusing on gene trees, phylogenetic relationships, models of nucleotide substitution, and genome evolution. These concepts are central to understanding the evolutionary history of life and are directly relevant to the chapters on Evolution, Phylogeny, DNA Synthesis, and Genomics.

Gene Trees and Species Trees

Gene Trees vs. Species Trees

Gene trees represent the evolutionary relationships among gene copies, while species trees depict the evolutionary relationships among species. These trees may not always match due to processes such as incomplete lineage sorting.

Gene Tree: A diagram showing the inferred evolutionary relationships among various gene copies found in different species.
Species Tree: A diagram showing the evolutionary relationships among a set of species.
Incomplete Lineage Sorting: Occurs when ancestral polymorphism persists through two or more speciation events, so that gene lineages fail to coalesce in the population where the species split and the genealogy of the gene can differ from the species tree.

Example: The BRCA1 gene tree may not perfectly match the phylogenetic relationships among mammals, illustrating the complexity of gene evolution.

Gene tree for BRCA1 and mammalian phylogeny

Gene Tree and Species Tree Concordance

Gene trees can either match or differ from species trees. This is illustrated by coalescence events, which trace the ancestry of gene copies back through time.

Coalescence: The merging of two gene lineages into a single ancestral copy as we trace their ancestry backward in time.
When gene trees equal species trees, the gene history mirrors the species history.
When gene trees do not equal species trees, it is often due to incomplete lineage sorting, although gene duplication and loss, horizontal gene transfer, and hybridization can also produce a mismatch.
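
The link between coalescence and incomplete lineage sorting can be made concrete for the simplest case: three species with tree ((A,B),C) and one gene copy sampled from each. The sketch below uses the standard coalescent result that the gene tree matches the species tree with probability 1 - (2/3)e^(-T), where T is the length of the internal branch in coalescent units (generations divided by 2N for a diploid population). The function names, the constant population size, and the absence of gene flow are assumptions made for illustration, not details taken from these notes.

```python
import math
import random


def p_concordant(t_internal):
    """Probability that a three-species gene tree matches the species tree.

    t_internal: length of the internal branch (the ancestor of A and B)
    in coalescent units. Assumes one sampled gene copy per species.
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t_internal)


def simulate_concordance(t_internal, n_loci=20000, seed=1):
    """Monte Carlo check of p_concordant for the species tree ((A,B),C)."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_loci):
        # Going backward in time, the A and B lineages share a population
        # for t_internal units and coalesce there at rate 1.
        wait = rng.expovariate(1.0)
        if wait < t_internal:
            matches += 1  # coalesced before reaching C's lineage
        elif rng.random() < 1.0 / 3.0:
            # Incomplete lineage sorting: all three lineages enter the
            # root population, and each pair is equally likely to
            # coalesce first. Only the (A,B) pair gives a matching tree.
            matches += 1
    return matches / n_loci


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 3.0):
        print(t, round(p_concordant(t), 3), round(simulate_concordance(t), 3))
```

Short internal branches (rapid successive speciation events or large ancestral populations) push the match probability toward 1/3, which is the value expected if the gene tree carried no information about the species tree.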

Comparison of gene tree and species tree with coalescence events

Models of Nucleotide Substitution

The Hasegawa Model of Nucleotide Substitution

Understanding how nucleotides change over time is crucial for reconstructing evolutionary trees. The Hasegawa model (usually cited as HKY85, after Hasegawa, Kishino, and Yano, 1985) describes the rate of transitions (purine to purine or pyrimidine to pyrimidine) separately from the rate of transversions (purine to pyrimidine or vice versa).

Transition: Substitution between two purines (A <-> G) or two pyrimidines (C <-> T).
Transversion: Substitution between a purine and a pyrimidine (A or G <-> C or T).
The Hasegawa model assigns transitions and transversions separate rates, and it allows the four bases to occur at unequal frequencies.
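
To show how separate transition and transversion rates enter a substitution model, here is a minimal sketch of a rate matrix, assuming "the Hasegawa model" refers to HKY85. In that model the rate of change from base x to base y is proportional to the frequency of y, multiplied by a ratio kappa when the change is a transition. The names `hky85_rate_matrix`, `kappa`, and `freqs` are illustrative, and the matrix is left unnormalised (it is not rescaled to one expected substitution per unit time).

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(x, y):
    """True if x -> y stays within purines or within pyrimidines."""
    return x != y and ((x in PURINES) == (y in PURINES))


def hky85_rate_matrix(kappa, freqs):
    """Build an unnormalised HKY85 rate matrix as a dict of dicts.

    kappa: transition/transversion rate ratio (kappa = 1 means no bias).
    freqs: dict mapping each base to its equilibrium frequency.
    """
    q = {}
    for x in BASES:
        row = {}
        for y in BASES:
            if x == y:
                continue
            rate = freqs[y]
            if is_transition(x, y):
                rate *= kappa
            row[y] = rate
        # Each diagonal entry makes its row sum to zero.
        row[x] = -sum(row.values())
        q[x] = row
    return q


if __name__ == "__main__":
    equal = {b: 0.25 for b in BASES}
    q = hky85_rate_matrix(kappa=2.0, freqs=equal)
    for x in BASES:
        print(x, [round(q[x][y], 3) for y in BASES])
```

With equal base frequencies this reduces to a two-parameter model in which the only asymmetry is the transition bias; setting `kappa` to 1 as well removes that bias and makes all substitutions equally likely.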

Diagram of nucleotide transitions and transversions

Purines and Pyrimidines

Nucleotides are classified as purines or pyrimidines, which is important for understanding mutation patterns in DNA.

Purines: Adenine (A) and Guanine (G)
Pyrimidines: Cytosine (C), Thymine (T, DNA only), and Uracil (U, RNA only)
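Transitions are generally observed more often than transversions, even though each base has two possible transversions and only one possible transition, because a transition exchanges bases with the same ring structure.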
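
The purine and pyrimidine classes above are enough to classify every difference between two aligned DNA sequences. The following sketch counts transitions and transversions; the function name `count_ts_tv` and the choice to skip gaps and ambiguous characters are assumptions made for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
DNA_BASES = PURINES | PYRIMIDINES


def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned DNA sequences.

    Positions with a gap or any character other than A, C, G, T are skipped.
    Returns a (transitions, transversions) tuple.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y or x not in DNA_BASES or y not in DNA_BASES:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1    # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


if __name__ == "__main__":
    # Differences: A/G (transition), C/T (transition), T/A (transversion)
    print(count_ts_tv("ACGTACGT", "GCGTATGA"))  # (2, 1)
```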

Structures of purines: adenine and guanine
Structures of pyrimidines: cytosine, thymine, uracil

Modern Phylogenetic Tree Reconstruction

Maximum Likelihood and Bayesian Inference

Modern methods for reconstructing evolutionary trees use statistical models of nucleotide substitution, which often yields more accurate phylogenies than simple parsimony methods, particularly when substitution rates differ among lineages.

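Maximum Likelihood: Selects the tree (topology and branch lengths) under which the observed genetic data are most probable, given a model of evolution.
Bayesian Inference: Combines prior information about trees with the likelihood of the observed data to obtain a posterior probability for each tree, and reports the trees with the highest posterior probability.
Because both methods rely on an explicit substitution model, they can correct for multiple substitutions at the same site, which parsimony tends to undercount.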
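
Full maximum likelihood tree searches are beyond a short example, but the core idea can be sketched on the smallest possible problem: estimating the evolutionary distance between two aligned sequences. The sketch below swaps in the simpler Jukes-Cantor (JC69) model, which treats all substitutions as equally likely, rather than the Hasegawa model discussed earlier. It finds the distance that maximises the likelihood by a plain grid search and compares it with the known closed-form JC69 estimate. All function names and the grid settings are illustrative assumptions.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of distance d (substitutions per site) under JC69."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # a site shows the same base in both sequences
    p_diff = 0.25 - 0.25 * e   # a site shows one specific different base
    # Each site's probability includes the base frequency of 1/4.
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))


def ml_distance_grid(n_sites, n_diff, step=0.001, d_max=3.0):
    """Return the grid value of d with the highest log-likelihood."""
    best_d = step
    best_ll = jc69_log_likelihood(step, n_sites, n_diff)
    for i in range(2, int(d_max / step) + 1):
        d = i * step
        ll = jc69_log_likelihood(d, n_sites, n_diff)
        if ll > best_ll:
            best_d, best_ll = d, ll
    return best_d


def jc69_distance(n_sites, n_diff):
    """Closed-form JC69 estimate; valid when fewer than 75% of sites differ."""
    p = n_diff / n_sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # 10 differences observed across 100 aligned sites
    print(ml_distance_grid(100, 10), jc69_distance(100, 10))
```

The estimated distance is slightly larger than the raw proportion of differing sites, because the model allows for sites that changed more than once. A tree-level analysis applies the same principle across all branches and candidate topologies at once.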

Applications of Evolutionary Trees

Biological and Clinical Insights

Evolutionary trees have provided significant insights into both biology and medicine, such as tracing the origins of diseases and understanding genome evolution.

Phylogenetic analysis can reveal the evolutionary history of pathogens, aiding in disease control and prevention.
Comparative genomics helps identify conserved and divergent genetic elements across species.

Genome Evolution

Differences in Genome Structure: Bacteria vs. Eukaryotes

Genome evolution varies significantly between bacteria and eukaryotes, particularly in the proportion of non-coding DNA.

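Bacteria: Genomes are compact, with most of the DNA coding for proteins and little space between genes.
Eukaryotes: Tend to accumulate more non-coding sequences, such as introns, intergenic regions, and repetitive elements, leading to larger and more complex genomes.
The number of protein-coding genes scales only weakly with genome size in eukaryotes, because much of the additional DNA is non-coding.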

Feature	Bacteria	Eukaryotes
Genome Size	Small (typically about 0.5 to 10 Mb)	Larger and far more variable (roughly 10 Mb to more than 100 Gb)
Non-coding DNA	Low	High
Gene Density	High	Low
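
The gene density row can be illustrated with a quick calculation of protein-coding genes per megabase. The genome sizes and gene counts below are rounded, approximate figures for two well-known reference genomes. They are supplied here for illustration only and do not come from these notes.

```python
# Approximate, rounded values used only for illustration.
GENOMES = {
    "Escherichia coli K-12": {"size_mb": 4.6, "protein_coding_genes": 4300},
    "Homo sapiens": {"size_mb": 3100.0, "protein_coding_genes": 20000},
}


def gene_density(size_mb, protein_coding_genes):
    """Protein-coding genes per megabase of genome."""
    return protein_coding_genes / size_mb


if __name__ == "__main__":
    for name, g in GENOMES.items():
        density = gene_density(g["size_mb"], g["protein_coding_genes"])
        print(f"{name}: {density:.1f} genes per Mb")
```

With these figures the human genome is several hundred times larger than the E. coli genome but has fewer than five times as many protein-coding genes, which is the pattern the table summarises.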

Comparison of genome size and protein-coding gene number in bacteria and eukaryotes

Summary

Understanding gene evolution, phylogenetic inference, and genome structure is essential for interpreting the evolutionary history of life. These concepts are foundational for advanced studies in evolution, genomics, and molecular biology.