Microbial Systems Biology and Genome Evolution: Study Notes

Microbial Systems Biology and Genomics

Introduction to Systems Biology

Systems biology is an integrative field that seeks to understand the complex interactions within biological systems, moving beyond the study of individual pathways to analyze networks of genes, proteins, and metabolites. This approach provides a holistic view of how organisms respond to their environment and adapt to changing conditions.

  • Systems biology uses data from genomics, transcriptomics, proteomics, and metabolomics to map and model biological processes.

  • Traditional microbiology focused on single pathways, but systems biology reveals the dynamic, interconnected nature of cellular functions.

  • Genomics is foundational, enabling the use of other 'omics' approaches and providing targets for drug and vaccine development.

  • Applications include monitoring disease outbreaks, discovering uncultured organisms, and identifying virulence factors.

Figure: Applications of genomics in systems biology

Genomics: Sequencing, Assembly, and Annotation

Genome Sequencing

Genome sequencing determines the precise order of nucleotides in DNA or RNA. Modern sequencing technologies have revolutionized microbiology by enabling rapid and comprehensive analysis of microbial genomes.

  • Sequencing: Determining the nucleotide sequence of DNA fragments.

  • Genome assembly: Piecing together short DNA sequences into longer, continuous sequences (contigs and scaffolds).

  • Genome annotation: Identifying genes and functional elements within the assembled genome.

  • Bioinformatics: Computational analysis of sequence data to predict gene function and structure.

Sanger Sequencing and Next-Generation Sequencing

The Sanger method, developed by Fred Sanger, was the first widely used DNA sequencing technique. It relies on chain-terminating dideoxynucleotides (ddNTPs) to generate DNA fragments of varying lengths, which are then separated and analyzed to determine the sequence.

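  • Sanger sequencing reads only about 800 nucleotides per reaction, so longer templates must be covered by successive reactions, each primed from the end of the previous read ("primer walking").

  • Next-generation sequencing (NGS) technologies, such as pyrosequencing and Illumina sequencing, allow massively parallel sequencing without prior knowledge of the sequence. Later single-molecule platforms, such as PacBio and Oxford Nanopore, produce much longer reads.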

  • NGS produces large numbers of short reads that must be computationally assembled.
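
The chain-termination idea described above can be illustrated with a minimal sketch. It assumes an idealized experiment in which every possible termination event occurs exactly once and fragment lengths are measured without error; the function names `termination_lengths` and `read_from_gel` are hypothetical and chosen only for this illustration.

```python
def termination_lengths(synthesized):
    """Simulate four idealized ddNTP reactions.

    For each base, record the lengths of the fragments that end with the
    corresponding dideoxynucleotide (one fragment per occurrence).
    """
    reactions = {base: [] for base in "ACGT"}
    for position, base in enumerate(synthesized, start=1):
        reactions[base].append(position)
    return reactions


def read_from_gel(reactions):
    """Recover the sequence by ordering fragments from shortest to longest.

    Electrophoresis separates fragments by length; the terminating base
    of each successive fragment gives the next base of the sequence.
    """
    by_length = {
        length: base
        for base, lengths in reactions.items()
        for length in lengths
    }
    return "".join(by_length[n] for n in sorted(by_length))


strand = "GATTACA"  # hypothetical newly synthesized strand
reactions = termination_lengths(strand)
print(reactions["A"])            # fragment lengths ending in ddATP
print(read_from_gel(reactions))  # sequence read back from fragment order
```

In a real run, the fragments carry fluorescent labels and are resolved by capillary electrophoresis; the sketch models only the logic of reading bases in order of fragment length.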

Figure: Structure of dideoxy ATP and its role in Sanger sequencing

Figure: Sanger sequencing process and capillary electrophoresis

Comparison of Sequencing Methods

Sequencing technologies are classified by generation, each with distinct methods and features.

Generation | Method | Features
First | Sanger dideoxy method | Read length: 700–900 bases; used for the Human Genome Project
Second | Pyrosequencing, Illumina, SOLiD | Shorter reads (35–700 bases); high throughput; used for large-scale projects
Third | Helicos, PacBio | Single-molecule sequencing; longer reads (up to 15 kb)
Fourth | Oxford Nanopore | Ultra-long reads (up to 900 kb); portable devices

Table: DNA sequencing methods

Genome Assembly and Annotation

Genome assembly involves aligning overlapping short reads to reconstruct the original DNA sequence. Annotation identifies open reading frames (ORFs) and other functional elements.

  • Computers merge overlapping sequences into contigs, which are further organized into scaffolds.

  • Annotation is often the bottleneck, as it requires identifying genes, regulatory regions, and non-coding RNAs.

  • Prokaryotic genomes are typically compact, with ORFs separated by short regulatory regions.
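
Merging overlapping reads into contigs can be sketched with a simple greedy strategy. This is a toy illustration, not how production assemblers work: it assumes error-free reads from a single strand, and the names `overlap`, `greedy_assemble`, and `min_len` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Stops when no pair overlaps by at least min_len bases and returns
    the remaining sequences (the contigs).
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return reads
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


# Hypothetical short reads sampled from one region
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

Real assemblers must also handle sequencing errors, repeats, and reads from both strands, which is why they rely on overlap graphs or de Bruijn graphs rather than pairwise greedy merging.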

Figure: Genome assembly process from fragments to contigs

Figure: Structure and identification of an ORF
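
Identifying ORFs during annotation can be sketched as a scan for a start codon followed, in the same reading frame, by a stop codon. The sketch below is a simplification: it assumes ATG as the only start codon, scans the forward strand only, and uses a hypothetical `min_codons` cutoff. Real gene finders also scan the reverse strand and use statistical models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Find forward-strand ORFs in all three reading frames.

    Returns (start, end, frame) tuples, where seq[start:end] runs from
    the ATG through the stop codon (end is exclusive).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical sequence: 2 leading bases, then ATG GCG TAC CTG TGA
seq = "CCATGGCGTACCTGTGACC"
for start, end, frame in find_orfs(seq):
    print(frame, seq[start:end])
```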

Comparative Genomics: Genome Size and Gene Content

Genome Size and ORF Content

Comparative genomics examines similarities and differences in genome size and gene content across organisms. In prokaryotes, there is a strong correlation between genome size and the number of ORFs.

  • Each megabase pair (Mbp) of prokaryotic DNA encodes approximately 1,000 ORFs.

  • Gene content increases proportionally with genome size in prokaryotes.

  • Eukaryotic genomes contain large amounts of noncoding DNA, so gene density is lower.
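
The rule of thumb of roughly 1,000 ORFs per Mbp gives a quick estimate of prokaryotic gene content. The sketch below applies it to hypothetical genome sizes; `estimate_orfs` is an illustrative name, and the result is only an approximation that does not apply to eukaryotic genomes.

```python
ORFS_PER_MBP = 1000  # approximate prokaryotic gene density from the text


def estimate_orfs(genome_size_bp):
    """Estimate the ORF count of a prokaryotic genome from its size in bp."""
    return round(genome_size_bp * ORFS_PER_MBP / 1_000_000)


# Hypothetical genome sizes in base pairs
for size_bp in (600_000, 4_600_000, 9_000_000):
    print(f"{size_bp / 1e6:.1f} Mbp -> about {estimate_orfs(size_bp)} ORFs")
```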

Figure: Correlation between genome size and ORF content

Gene Content and Functional Categories

The distribution of gene functions varies with genome size. Larger genomes devote a greater proportion of their genes to regulation and environmental adaptation.

  • Core cellular processes (DNA replication, transcription, translation) show minor variation in gene number.

  • Genes for signal transduction and transcriptional regulation increase with genome size, enabling metabolic versatility.

Figure: Relative percent of ORFs by function versus genome size

Organelle Genomes

Mitochondria and Chloroplasts

Mitochondria and chloroplasts are organelles with their own genomes, derived from endosymbiotic bacteria. Their genomes encode essential components for energy metabolism and gene expression.

  • Chloroplast genomes: Circular DNA, 120–160 kbp, encode rRNAs, tRNAs, and proteins for photosynthesis and gene expression.

  • Mitochondrial genomes: Encode proteins for oxidative phosphorylation, rRNAs, and tRNAs; often smaller and may be circular or linear.

  • Many organelle proteins are encoded by nuclear genes, reflecting gene transfer during evolution.

Genome Evolution: Gene Families, Duplications, and Horizontal Gene Transfer

Gene Families and Duplications

Gene families are groups of homologous genes that arise through gene duplication events. Duplications allow one gene copy to retain its original function while the other evolves new functions.

  • Homologs: Genes related by evolutionary ancestry.

  • Paralogs: Genes within the same organism that arose by duplication.

  • Orthologs: Genes in different organisms that descend from a single gene in their common ancestor and diverged through speciation.

  • Gene duplication is a major driver of evolutionary innovation.
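
The definitions above can be turned into a toy labeling scheme. This sketch assumes that family membership is already known and simply applies the simplified definitions from the list: homologs in the same organism are labeled paralogs, and homologs in different organisms are labeled orthologs. Real orthology inference requires phylogenetic analysis. The gene names, species names, and `classify_pairs` are hypothetical.

```python
from itertools import combinations


def classify_pairs(genes):
    """Label each homologous gene pair as paralogs or orthologs.

    genes: list of (gene_name, organism, family) tuples.
    Genes in different families are not homologs and are skipped.
    """
    labels = {}
    for (g1, org1, fam1), (g2, org2, fam2) in combinations(genes, 2):
        if fam1 != fam2:
            continue
        labels[(g1, g2)] = "paralogs" if org1 == org2 else "orthologs"
    return labels


# Hypothetical genes: two copies of family A in species 1, one in species 2
genes = [
    ("geneA1", "species1", "familyA"),
    ("geneA2", "species1", "familyA"),
    ("geneA1x", "species2", "familyA"),
    ("geneB1", "species1", "familyB"),
]
for pair, label in classify_pairs(genes).items():
    print(pair, label)
```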

Figure: Gene duplication and evolution of new functions

Figure: Paralogs and orthologs in gene evolution

Horizontal Gene Transfer (HGT) and the Mobilome

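Horizontal gene transfer is the movement of genetic material between organisms by routes other than parent-to-offspring (vertical) inheritance. It is a key mechanism for microbial evolution and adaptation.

  • HGT occurs via transformation (uptake of free DNA), transduction (transfer by bacteriophages), and conjugation (transfer through direct cell-to-cell contact).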

  • Mobile genetic elements (the mobilome) include plasmids, prophages, integrons, insertion sequences, and transposons.

  • HGT can introduce new metabolic capabilities or virulence factors.

Figure: Vertical vs. horizontal gene transfer mechanisms

Figure: Mobile genetic elements and their roles

Core Genome, Pan Genome, and Chromosomal Islands

Core Genome and Pan Genome

The core genome consists of genes shared by all strains of a species, while the pan genome includes the core plus all accessory genes found in some but not all strains. This concept explains the genetic diversity within microbial species.

  • HGT and mobile elements contribute to the expansion of the pan genome.

  • Strains of the same species can differ greatly in gene content and capabilities.
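
The core and pan genome map directly onto set operations: the core genome is the intersection of the strains' gene sets, and the pan genome is their union. The sketch below uses hypothetical strain and gene names and assumes gene presence or absence has already been determined for each strain; `core_and_pan` is an illustrative name.

```python
def core_and_pan(strains):
    """Compute core, pan, and accessory gene sets.

    strains: dict mapping strain name -> iterable of gene names.
    Requires at least one strain.
    """
    gene_sets = [set(genes) for genes in strains.values()]
    core = set.intersection(*gene_sets)  # genes present in every strain
    pan = set.union(*gene_sets)          # genes present in any strain
    accessory = pan - core               # genes missing from some strains
    return core, pan, accessory


# Hypothetical gene content of three strains
strains = {
    "strain1": ["dnaA", "rpoB", "gyrA", "toxX"],
    "strain2": ["dnaA", "rpoB", "gyrA", "resY"],
    "strain3": ["dnaA", "rpoB", "gyrA"],
}
core, pan, accessory = core_and_pan(strains)
print(sorted(core))
print(sorted(pan))
print(sorted(accessory))
```

As more strains are added, the core set can only shrink or stay the same, while the pan set can only grow or stay the same.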

Figure: Core genome and pan genome in Salmonella

Chromosomal Islands and Pathogenicity Islands

Chromosomal islands are large DNA segments in the chromosome that contain clusters of genes for specialized functions, such as virulence, symbiosis, or pollutant degradation. Pathogenicity islands are a type of chromosomal island that encode virulence factors.

  • Chromosomal islands often differ from the rest of the chromosome in GC content and codon usage, suggesting horizontal acquisition.

  • They may be flanked by repeat sequences and can carry integrase genes.

  • Pathogenicity islands increase the genome size and virulence of pathogenic strains.
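
Because chromosomal islands often differ from the rest of the chromosome in GC content, a sliding-window GC scan is one simple way to flag candidate regions. The sketch below is illustrative only: the window size, step, and threshold are arbitrary assumptions, the names `gc_fraction` and `atypical_windows` are hypothetical, and atypical GC content alone does not prove horizontal acquisition.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome, window=1000, step=500, threshold=0.10):
    """Return windows whose GC fraction deviates from the genome average.

    Each hit is (start, end, gc), with end exclusive. A window is
    reported when |gc - genome average| >= threshold.
    """
    baseline = gc_fraction(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - baseline) >= threshold:
            hits.append((start, start + window, gc))
    return hits


# Hypothetical toy "genome": AT-rich backbone with a GC-rich insert
genome = "AT" * 20 + "GC" * 10 + "AT" * 20
print(atypical_windows(genome, window=20, step=20, threshold=0.5))
```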

Figure: Chromosomal islands and mobile elements in a bacterial cell

Figure: Genome map showing pathogenicity islands in E. coli
