Genomic Technologies and Applications: DNA Sequencing, Genomics, and Multi-Omics

Study Guide - Smart Notes

Recombinant DNA Technology and Genetically Modified Organisms (GMOs)

Introduction to Recombinant DNA and GMOs

Recombinant DNA technology enables the manipulation and combination of DNA from different sources to create genetically modified organisms (GMOs). This technology is foundational for modern genetics, biotechnology, and genomics.

Genetically Modified Organisms (GMOs): Organisms whose genetic material has been altered using recombinant DNA methods to introduce new traits or functions.
Recombinant DNA: DNA molecules formed by laboratory methods of genetic recombination, such as molecular cloning, to bring together genetic material from multiple sources.
Applications: Agriculture (e.g., pest-resistant crops), medicine (e.g., insulin production), and research (e.g., gene function studies).

How are GMO plants made?

Polymerase Chain Reaction (PCR) and DNA Amplification

Principles and Applications of PCR

The polymerase chain reaction (PCR) is a technique used to amplify specific DNA segments, making millions of copies from a small initial sample. PCR is essential for genetic analysis, cloning, and sequencing.

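Key Steps: Denaturation (heat separates the two strands), annealing (primers bind the target), and extension (a heat-stable DNA polymerase copies the template), repeated over many cycles.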
Applications: Genetic testing, cloning, forensic analysis, and preparation for sequencing.
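
The "millions of copies" figure follows from near-doubling of the target in each cycle. A minimal sketch of that arithmetic is below; the function name and the efficiency parameter are illustrative assumptions, and real reactions plateau as reagents run out.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a number of PCR cycles.

    efficiency = 1.0 is the idealized case of perfect doubling per cycle;
    lower values model a reaction that copies only a fraction of templates.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Idealized example: 10 template molecules, 30 cycles.
print(pcr_copies(10, 30))
# Same reaction at an assumed 90% per-cycle efficiency.
print(pcr_copies(10, 30, efficiency=0.9))
```

Even a single template molecule passes one million copies after 20 ideal cycles, since 2 to the power 20 is 1,048,576.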

DNA Sequencing Technologies

Overview of DNA Sequencing

DNA sequencing determines the precise order of nucleotides (A, T, C, G) in a DNA molecule. Sequencing technologies have evolved through three generations, each with distinct features and applications.

First Generation: Sanger sequencing (chain-termination method).
Second Generation: Next Generation Sequencing (NGS), e.g., Illumina platforms.
Third Generation: Single-molecule sequencing, e.g., PacBio and Nanopore.

DNA sequencer (Illumina NextSeq 500)

First Generation: Sanger Sequencing

Sanger sequencing uses chain-terminating dideoxynucleotides (ddNTPs) to generate DNA fragments of varying lengths, which are then separated by size using gel or capillary electrophoresis to determine the DNA sequence.

Chain Termination: Incorporation of ddNTPs halts DNA synthesis due to the absence of a 3'-OH group.
Detection: Fragments are separated by size, and the terminal nucleotide is identified by fluorescence or radioactivity.
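
The logic of reading a sequence from chain-terminated fragments can be sketched as a toy simulation. It assumes one fragment per possible stop position and perfect size separation; the function names are hypothetical and no chemistry is modeled.

```python
import random


def chain_terminated_fragments(synthesized: str) -> list[tuple[int, str]]:
    """Return one (length, terminal_base) pair per possible ddNTP stop position."""
    return [(i + 1, base) for i, base in enumerate(synthesized)]


def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Sort fragments by length, as electrophoresis does, and read the terminal bases."""
    return "".join(base for _, base in sorted(fragments))


fragments = chain_terminated_fragments("GATTACA")
random.shuffle(fragments)  # fragments start out mixed together in the reaction
print(read_sequence(fragments))  # GATTACA
```

Sorting by length stands in for electrophoresis: the shortest fragment ends at the first base, the next shortest at the second, and so on.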

Sanger sequencing gel and readout
Structural difference between dNTP and ddNTP

Second Generation: Next Generation Sequencing (NGS)

NGS technologies, such as Illumina sequencing, enable massively parallel sequencing of millions of short DNA fragments, increasing throughput and reducing cost per base.

Sequencing by Synthesis: Fluorescently labeled reversible-terminator nucleotides are added one per cycle, and each incorporation is detected by imaging before the next cycle begins.
Applications: Whole genome sequencing, transcriptomics, and large-scale genetic studies.
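
Throughput is often planned in terms of mean coverage, which is the total number of sequenced bases divided by the genome size. The sketch below assumes that simple formula; the 150 bp read length and 30x target are illustrative values rather than figures from these notes, and the function names are hypothetical.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced: (reads x length) / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Human-sized genome (~3.1 billion bases), assumed 150 bp reads, assumed 30x target.
print(reads_needed(30, 150, 3_100_000_000))  # 620000000
```

Mean coverage is only an average; individual regions can still be covered more or less deeply than the target.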

Massively parallel DNA sequencing
Illumina sequencing steps

Third Generation: Single-Molecule Sequencing

Third-generation sequencing technologies, such as PacBio and Nanopore, sequence single DNA molecules in real time, producing long reads and enabling the analysis of complex genomic regions.

PacBio SMRT Sequencing: Monitors DNA synthesis by a single polymerase in a nanoscale well (zero-mode waveguide).
Nanopore Sequencing: DNA strands pass through a nanopore, and changes in electrical current are used to identify bases.
Advantages: Long read lengths, real-time sequencing, and detection of epigenetic modifications.

Single molecule DNA sequencing (Nanopore)

Genomics and Genome Analysis

What is Genomics?

Genomics is the study of the complete set of DNA (genome) in an organism, including its structure, function, evolution, and mapping. It encompasses the analysis of all genes and their interactions.

Structural Genomics: Determining the DNA sequence and physical organization of the genome.
Functional Genomics: Assigning biological functions to genomic elements, such as genes and regulatory regions.
Comparative Genomics: Comparing genomes across species to identify conserved and unique features.

Genome Sequencing and Assembly

Whole genome sequencing (WGS) involves determining the complete DNA sequence of an organism's genome. Assembly reconstructs the genome from overlapping sequence reads, which may be short or long depending on the platform.

DNA Libraries: Collections of DNA fragments representing the genome, prepared for sequencing.
cDNA Libraries: Collections of DNA copies of mRNA, representing expressed genes.
Genome Assembly: Overlapping sequence reads are aligned to form contiguous sequences (contigs) and scaffolds.
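
The idea of joining overlapping reads into contigs can be illustrated with a toy greedy merger. This is a sketch under strong assumptions (error-free reads, a single strand, no repeats), not how production assemblers work; those typically use overlap or de Bruijn graphs. The function names and the min_len parameter are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge: these are the contigs
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


print(greedy_assemble(["ATGGCG", "GCGTGC", "TGCAAT"]))  # ['ATGGCGTGCAAT']
```

Reads with no sufficient overlap stay as separate contigs, which is why real assemblies need extra linking information to order contigs into scaffolds.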

DNA sequencing workflow
Genomic DNA library preparation
cDNA library preparation
Genome assembly: contigs and scaffolds
Genetic mapping vs. physical mapping

Human Genome Project (HGP)

The Human Genome Project was an international effort to sequence the human genome and map its genes. It provided a reference genome and advanced our understanding of human genetics and disease.

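Timeline: Draft announced in 2000 and declared complete in 2003; a gapless assembly followed in 2022 from the Telomere-to-Telomere (T2T) Consortium.
Methods: BAC/YAC cloning and Sanger sequencing; later refinements of the reference, including the 2022 gapless assembly, drew on newer technologies such as long-read sequencing.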
Outcomes: Identification of ~20,000 protein-coding genes, discovery of genetic variation (SNPs, CNVs), and comparative genomics insights.

Human Genome Project
Reference genome assembly from multiple individuals

Features of the Human Genome

Genome Size: ~3.1 billion nucleotides
Protein-Coding Genes: ~20,000 (their coding sequences make up roughly 1-2% of the genome)
Genetic Variation: SNPs (single-nucleotide polymorphisms), CNVs (copy number variations)
ENCODE Project: Catalogs functional elements, including regulatory regions and non-coding RNAs.

SNP (Single Nucleotide Polymorphism)
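
A SNP is a single-base difference at a given position, so candidate SNPs between two already aligned sequences can be found by position-wise comparison. The sketch below assumes equal-length, gap-free sequences; the function name and the example sequences are made up for illustration.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


print(find_snps("ATGCCGTA", "ATGTCGTA"))  # [(4, 'C', 'T')]
```

A mismatch in one individual is only a candidate; it counts as a polymorphism when the variant is shared across a population.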

Multi-Omics Technologies

Introduction to Multi-Omics

Multi-omics integrates data from genomics, transcriptomics, epigenomics, proteomics, and metabolomics to provide a comprehensive view of biological systems.

Genomics: Analysis of the entire DNA sequence.
Transcriptomics: Study of all expressed RNA molecules (mRNA, ncRNA).
Epigenomics: Analysis of DNA methylation, histone modifications, and chromatin structure.
Proteomics: Study of all proteins, their modifications, and functions.
Metabolomics: Analysis of metabolites and small molecules involved in metabolism.

Multi-omics technology overview
Multi-omics technology summary

Transcriptomics

Transcriptomics analyzes gene expression at the RNA level, providing insights into which genes are active in specific cells or conditions.

RNA Sequencing (RNA-seq): Uses NGS to quantify and compare gene expression.
Applications: Cancer diagnostics, developmental biology, and disease research.
Bulk vs. Single-Cell RNA-seq: Bulk analyzes average expression; single-cell reveals cell-specific heterogeneity.
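
Comparing expression between genes requires normalizing read counts for gene length and sequencing depth. One widely used unit is transcripts per million (TPM), sketched below; it is offered as an illustration, not as the method these notes prescribe. The gene names, counts, and lengths are invented.

```python
def tpm(counts: dict[str, int], lengths_bp: dict[str, int]) -> dict[str, float]:
    """Transcripts per million: length-normalized counts rescaled to sum to 1,000,000."""
    # Step 1: reads per base, so long genes are not favored for simply being long.
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads were assigned to any gene")
    # Step 2: rescale so values are comparable across samples of different depth.
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}


# Hypothetical genes with equal read counts but different lengths.
example = tpm({"geneA": 100, "geneB": 100}, {"geneA": 1000, "geneB": 2000})
print(example)
```

With equal read counts, the shorter gene ends up with twice the TPM of the longer one, because the same number of reads came from half as much sequence.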

Transcriptomics workflow and applications

Epigenomics

Epigenomics studies heritable changes in gene expression that do not involve changes to the DNA sequence, such as DNA methylation and histone modification.

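Methods: Whole genome bisulfite sequencing (WGBS, for DNA methylation), ATAC-seq (for chromatin accessibility), ChIP-seq (for histone modifications and protein-DNA binding).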
Applications: Cancer research, developmental biology, environmental and aging studies.
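
Bisulfite sequencing works because bisulfite treatment converts unmethylated cytosines to uracil, which is read as T, while methylated cytosines remain C. The methylation level at a site can then be estimated from the reads covering it. This is a minimal sketch: the function name is hypothetical, and it ignores incomplete conversion and sequencing errors.

```python
def methylation_level(base_calls: str) -> float:
    """Fraction of reads reporting C at a reference cytosine after bisulfite treatment.

    base_calls holds one character per read covering the site:
    'C' means the cytosine was protected (methylated), 'T' means it was converted.
    """
    methylated = base_calls.count("C")
    unmethylated = base_calls.count("T")
    covered = methylated + unmethylated
    if covered == 0:
        raise ValueError("no informative reads at this site")
    return methylated / covered


print(methylation_level("CCCTCCTC"))  # 0.75
```

Other characters in the input are ignored, so a site covered only by uninformative calls raises an error instead of returning a misleading value.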

Epigenomics methods and applications

Proteomics

Proteomics involves the large-scale study of proteins, including their expression, structure, and function. Mass spectrometry is a key tool in proteomics research.

LC-MS/MS: Liquid chromatography-tandem mass spectrometry separates and identifies peptides.
Applications: Cancer research, biomarker discovery, and functional annotation of the genome.
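
In a typical bottom-up workflow (an assumption here, since the notes do not name the digestion step), proteins are cut into peptides with trypsin before LC-MS/MS. Trypsin cleaves after lysine (K) or arginine (R), generally not when the next residue is proline (P). The sketch below applies that rule to a made-up sequence; the function name is hypothetical and missed cleavages are not modeled.

```python
import re


def tryptic_digest(protein: str) -> list[str]:
    """Split a protein sequence after K or R, except when followed by P."""
    # Zero-width split points: preceded by K/R and not followed by P.
    # Splitting on empty matches requires Python 3.7 or newer.
    return [peptide for peptide in re.split(r"(?<=[KR])(?!P)", protein) if peptide]


print(tryptic_digest("AKRPGRTK"))  # ['AK', 'RPGR', 'TK']
```

The peptides predicted this way are what database search tools compare against measured spectra to identify proteins.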

Proteomics workflow (LC-MS/MS)

Comparative Genomics and Metagenomics

Comparative Genomics

Comparative genomics compares genome sequences across species to identify conserved elements, gene functions, and evolutionary relationships.

Applications: Disease gene identification, evolutionary biology, and species adaptation studies.
Example: Human and Neanderthal genome comparison reveals shared and unique genetic features.

Metagenomics

Metagenomics involves sequencing DNA from entire microbial communities in natural environments, bypassing the need for culturing individual species.

Applications: Microbiome research, environmental monitoring, and discovery of novel genes and organisms.
Human Microbiome Project: Sequenced genomes of hundreds of human-associated microbes to study health and disease.

Functional Genomics

Goals and Approaches

Functional genomics aims to understand the roles of genes and other genomic elements by integrating data from transcriptomics, epigenomics, and proteomics.

Transcriptome: All RNA molecules transcribed from the genome.
Epigenome: All chemical modifications to DNA and histones.
Proteome: All proteins encoded by the genome.
Techniques: Microarrays, RNA-seq, ChIP-seq, mass spectrometry.

Summary Table: DNA Sequencing Technologies

Generation	Technology	Read Length	Throughput	Key Features
First	Sanger Sequencing	500-1,000 bp	Low	Accurate, single gene, slow
Second	Illumina (NGS)	50-500 bp	High	Massively parallel, short reads, cost-effective
Third	PacBio, Nanopore	10,000+ bp	Moderate to High	Long reads, real-time, single molecule

Additional info: The integration of multi-omics data is increasingly important for systems biology, personalized medicine, and understanding complex traits and diseases.