Genomic Technologies and Applications: DNA Sequencing, Genomics, and Multi-Omics

Study Guide - Smart Notes

Recombinant DNA Technology and Genetically Modified Organisms (GMOs)

Introduction to GMOs and Recombinant DNA

Recombinant DNA technology enables the creation of genetically modified organisms (GMOs) by combining DNA from different sources. This technology is foundational in modern genetics, agriculture, and medicine.

  • Genetically Modified Organisms (GMOs): Organisms whose genetic material has been altered using recombinant DNA methods.

  • Recombinant DNA: DNA molecules formed by laboratory methods of genetic recombination to bring together genetic material from multiple sources.

  • Applications: Agriculture (GMO crops), medicine (insulin production), research (gene function studies).

Figure: How are GMO plants made?

DNA Sequencing Technologies

Overview of DNA Sequencing

DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule. Advances in sequencing technologies have revolutionized genomics, enabling high-throughput and cost-effective analysis of entire genomes.

  • Goal: Identify the sequence of nucleotide bases (A, T, C, G) in DNA samples.

  • Applications: Genome mapping, disease gene identification, evolutionary studies.

Generations of DNA Sequencing

DNA sequencing technologies are classified into three generations, each with distinct methodologies and capabilities.

  • First Generation: Sanger sequencing – chain-termination method, low throughput, high accuracy for small fragments.

  • Second Generation: Next Generation Sequencing (NGS) – massively parallel, short reads, high throughput, cost-effective.

  • Third Generation: Single-molecule sequencing (e.g., PacBio, Nanopore) – long reads, real-time sequencing, suitable for complex genomes.

Sanger Sequencing (First Generation)

Sanger sequencing uses chain-terminating dideoxynucleotides (ddNTPs) to generate DNA fragments of varying lengths, which are then separated and analyzed to determine the DNA sequence.

  • Principle: Incorporation of ddNTPs terminates DNA synthesis at specific bases due to the absence of a 3'-OH group.

  • Detection: Fragments are separated by size using gel electrophoresis; fluorescent or radiolabeled ddNTPs allow for base identification.

  • Applications: Sequencing single genes or small DNA fragments.

Figures: Original Sanger sequencing gel and readout; structural difference between dNTP and ddNTP.

Sanger Sequencing Steps

  1. Cycle sequencing (a PCR-like reaction using a single primer) with fluorescent, chain-terminating ddNTPs

  2. Size separation by capillary gel electrophoresis

  3. Laser excitation and detection by sequencing machine
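
The three steps above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: every possible termination point yields exactly one fragment, there are no read errors, and the function name `sanger_read` and the example template are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sanger_read(template: str, seed: int = 0) -> str:
    """Return the sequence read from the strand synthesized on `template`."""
    # The newly synthesized strand is the reverse complement of the template.
    new_strand = "".join(COMPLEMENT[base] for base in reversed(template))

    # Step 1: chain termination. A ddNTP can stop synthesis at any position,
    # so we get one fragment per length, tagged with its terminal (labeled) base.
    fragments = [
        (length, new_strand[length - 1])
        for length in range(1, len(new_strand) + 1)
    ]
    random.Random(seed).shuffle(fragments)  # fragments start out mixed together

    # Step 2: electrophoresis orders the fragments from shortest to longest.
    fragments.sort(key=lambda fragment: fragment[0])

    # Step 3: the detector reads the terminal label of each fragment in turn.
    return "".join(label for _, label in fragments)


print(sanger_read("GATTACA"))  # TGTAATC
```

Reading the terminal labels in order of fragment length recovers the synthesized strand one base at a time, which is why size separation is the step that turns a mixture of fragments into a sequence.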

Next Generation Sequencing (NGS, Second Generation)

NGS technologies, such as Illumina sequencing, enable massively parallel sequencing of millions of short DNA fragments, greatly increasing throughput and reducing costs.

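  • Sequencing by Synthesis: In Illumina sequencing, fluorescently labeled, reversibly terminated nucleotides are added one base per cycle, and every cluster on the flow cell is imaged after each addition, allowing for high-throughput data collection.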

  • Applications: Whole genome sequencing, transcriptomics, metagenomics.

  • Advantages: High throughput, cost-effective, suitable for large-scale studies.
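
A toy model can make "massively parallel" concrete: in each cycle, every template is extended by one base and all of them are read in the same pass. The sketch below is a simplification, not a model of the real chemistry. It assumes error-free incorporation, writes each template in the order the polymerase encounters its bases, and uses a hypothetical function name `sequence_by_synthesis`.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(templates: list[str], cycles: int) -> list[str]:
    """Read `cycles` bases from every template, one base per cycle."""
    reads = ["" for _ in templates]
    for cycle in range(cycles):
        # One cycle: each template incorporates a single complementary base,
        # and the whole set is "imaged" together before the next cycle starts.
        for index, template in enumerate(templates):
            if cycle < len(template):
                reads[index] += COMPLEMENT[template[cycle]]
    return reads


print(sequence_by_synthesis(["ACGT", "TTGA"], cycles=3))  # ['TGC', 'AAC']
```

The number of cycles sets the read length, while the number of templates read per cycle sets the throughput, which is why short reads and high throughput go together in this design.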

Figures: Massively parallel DNA sequencing; Illumina NextSeq 500 DNA sequencer; Illumina sequencing steps.

Third Generation Sequencing

Third generation sequencing technologies, such as PacBio and Nanopore, sequence single DNA molecules in real time, producing long reads that facilitate assembly of complex genomes.

  • Single-Molecule Real-Time (SMRT) Sequencing: DNA polymerase synthesizes DNA in a nanowell, and nucleotide incorporation is detected in real time.

  • Nanopore Sequencing: DNA passes through a nanopore, and changes in electrical current are used to identify bases.

  • Advantages: Long read lengths, real-time data, ability to resolve repetitive regions.

Figures: Single-molecule DNA sequencing (Nanopore); portable Nanopore sequencer; SMRT sequencing workflow.

Comparison of Sequencing Technologies

Generation | Technology       | Read Length  | Throughput | Key Features
-----------|------------------|--------------|------------|--------------------------------
First      | Sanger           | 500-1,000 bp | Low        | High accuracy, single gene
Second     | Illumina (NGS)   | 50-500 bp    | High       | Massively parallel, short reads
Third      | PacBio, Nanopore | 10,000+ bp   | Very High  | Long reads, real-time

Genomics and Genome Analysis

What is Genomics?

Genomics is the study of the complete set of DNA (genome) in an organism. It encompasses the structure, function, evolution, and mapping of genomes.

  • Structural Genomics: Determining the DNA sequence and physical structure of genomes.

  • Functional Genomics: Assigning biological functions to genomic elements (genes, regulatory elements).

  • Comparative Genomics: Comparing genomes across species to identify conserved and unique features.

Genome Sequencing and Assembly

Whole genome sequencing (WGS) involves sequencing the entire genome and assembling the sequence reads into a complete genome.

  • DNA Libraries: Collections of DNA fragments representing the genome, used for sequencing.

  • Genomic Libraries: Contain all DNA, including coding and noncoding regions.

  • cDNA Libraries: Built by reverse-transcribing mRNA, so they contain only expressed genes and lack introns.

  • Genome Assembly: Overlapping sequence reads are merged to form contigs and scaffolds, which are then mapped to chromosomes.
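
To illustrate what "mRNA-derived" means for a cDNA library: reverse transcription produces a DNA strand complementary to the mRNA. A minimal sketch, assuming an error-free full-length copy and using a hypothetical function name `first_strand_cdna`:

```python
# Base pairing used when DNA is synthesized on an RNA template.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return first-strand cDNA (written 5'->3') for an mRNA written 5'->3'."""
    # The cDNA is antiparallel to the mRNA, so we walk the mRNA backwards
    # and pair each base with its DNA complement.
    return "".join(RNA_TO_DNA_PAIR[base] for base in reversed(mrna))


print(first_strand_cdna("AUGGCC"))  # GGCCAT
```

Because the template is processed mRNA, introns and other DNA that never appears in mature transcripts are absent from the resulting library.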

Figures: DNA sequencing workflow; genomic DNA library construction; cDNA library synthesis.

Human Genome Project (HGP)

The Human Genome Project was an international effort to sequence and map all human genes. It provided a reference genome for biomedical research and comparative genomics.

  • Timeline: Draft announced in 2000 and declared complete in 2003; the first gapless assembly (T2T-CHM13, from the separate Telomere-to-Telomere Consortium) followed in 2022.

  • Methods: BAC/YAC cloning, Sanger sequencing, assembly of contigs and scaffolds.

  • Outcomes: Identification of ~20,000 protein-coding genes, discovery of genetic variation (SNPs, CNVs), and insights into human evolution.

Figure: Human Genome Project timeline and reference genome.

Genome Assembly: Contigs, Scaffolds, and Chromosomes

Genome assembly is the process of reconstructing the original genome from short sequence reads.

  • Contig: A continuous sequence assembled from overlapping reads.

  • Scaffold: A series of contigs ordered and oriented using additional information (e.g., paired-end reads, long reads, or genetic maps), usually with gaps of estimated size between them.

  • Chromosome: The final, ordered assembly of scaffolds.
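
How overlapping reads become a contig can be sketched with a greedy merge: repeatedly join the two sequences with the longest suffix-to-prefix overlap. This is a teaching sketch, not how production assemblers work. It assumes error-free reads from a single strand, and the names `overlap`, `greedy_assemble`, and `min_len` are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge reads into contigs by always joining the best-overlapping pair."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [
            c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)
        ] + [merged]


reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Reads that share no sufficient overlap stay as separate contigs, which is the situation scaffolding then tries to resolve with additional information.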

Genetic Mapping vs. Physical Mapping

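Type         | Basis                   | Unit                    | Resolution
-------------|-------------------------|-------------------------|------------------------
Genetic Map  | Recombination frequency | centimorgan (cM)        | Low (marker order)
Physical Map | Physical distance       | base pairs (bp, kb, Mb) | High (precise location)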
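
As a worked example of the genetic-map unit: 1 cM corresponds to a 1% recombination frequency between two markers. The sketch below assumes short distances, where double crossovers are rare enough to ignore; the function name `map_distance_cm` and the offspring counts are hypothetical.

```python
def map_distance_cm(recombinant: int, total: int) -> float:
    """Estimate genetic distance in cM from offspring counts."""
    if total <= 0 or not 0 <= recombinant <= total:
        raise ValueError("need 0 <= recombinant <= total and total > 0")
    # Recombination frequency as a percentage equals the distance in cM
    # (a reasonable approximation only for closely linked markers).
    return 100.0 * recombinant / total


# 12 recombinant offspring out of 200 scored
print(map_distance_cm(12, 200))  # 6.0
```

The result says nothing about how many base pairs separate the markers, since recombination rates vary along chromosomes. That is why a physical map is needed for precise locations.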

Features of the Human Genome

  • Genome Size: ~3.1 billion base pairs (haploid)

  • Protein-Coding Genes: ~20,000; their coding sequences make up less than 2% of the genome

  • Genetic Diversity: ~99.9% identical among humans; diversity due to SNPs and CNVs

  • ENCODE Project: Catalogs functional elements, including regulatory regions and non-coding RNAs
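
The percentages above are easier to appreciate as absolute numbers. The short calculation below uses the approximate figures from this section as inputs, treating 2% as an upper bound for coding sequence; the function name `fraction_to_bp` is hypothetical.

```python
GENOME_SIZE_BP = 3.1e9    # approximate genome size from this section
IDENTITY = 0.999          # ~99.9% identical between two people
CODING_FRACTION = 0.02    # upper bound for protein-coding sequence


def fraction_to_bp(genome_size_bp: float, fraction: float) -> float:
    """Convert a fraction of the genome into a number of base pairs."""
    return genome_size_bp * fraction


differing_bp = fraction_to_bp(GENOME_SIZE_BP, 1.0 - IDENTITY)
coding_bp = fraction_to_bp(GENOME_SIZE_BP, CODING_FRACTION)

print(f"Differing positions: about {differing_bp / 1e6:.1f} million bp")
print(f"Coding sequence: at most about {coding_bp / 1e6:.0f} million bp")
```

So even a 0.1% difference corresponds to roughly three million positions, which is why SNPs and CNVs can account for substantial diversity.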

Multi-Omics Technologies

Overview of Multi-Omics

Multi-omics integrates data from various molecular layers to provide a comprehensive view of biological systems.

  • Genomics: Study of DNA sequences and genetic variation.

  • Transcriptomics: Analysis of RNA transcripts (gene expression).

  • Epigenomics: Study of epigenetic modifications (DNA methylation, histone modification).

  • Proteomics: Analysis of the complete set of proteins.

  • Metabolomics: Study of metabolites and metabolic pathways.

Transcriptomics

Transcriptomics examines the complete set of RNA transcripts in a cell or tissue, providing insights into gene expression and regulation.

  • RNA Sequencing (RNA-seq): Uses NGS to quantify and compare gene expression levels.

  • Applications: Cancer diagnostics, developmental biology, disease research.

  • Bulk RNA-seq: Measures average expression across many cells.

  • Single-cell RNA-seq: Resolves expression in individual cells, revealing heterogeneity.
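
Comparing expression levels requires normalizing raw read counts, because longer genes and deeper sequencing runs both yield more reads. The sketch below uses transcripts per million (TPM), one common normalization that this guide does not itself specify. The gene names, counts, and lengths are made up for illustration.

```python
def tpm(counts: dict[str, int], lengths_bp: dict[str, int]) -> dict[str, float]:
    """Convert raw read counts to transcripts per million (TPM)."""
    # Step 1: correct for gene length (reads per base of transcript).
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        raise ValueError("no reads to normalize")
    # Step 2: rescale so the values sum to one million within the sample.
    return {gene: 1e6 * rate / total_rate for gene, rate in rates.items()}


counts = {"geneA": 100, "geneB": 100}        # hypothetical read counts
lengths = {"geneA": 1000, "geneB": 2000}     # hypothetical lengths in bp
for gene, value in tpm(counts, lengths).items():
    print(gene, round(value))
```

Both genes have the same raw count here, but geneA is half as long, so its normalized expression comes out twice as high.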

Epigenomics

Epigenomics is the genome-wide study of epigenetic marks: heritable changes in gene expression that do not involve changes to the DNA sequence.

  • DNA Methylation: Addition of methyl groups to DNA (in mammals, mainly at cytosines in CpG dinucleotides); methylation of promoters often silences genes.

  • Histone Modification: Chemical changes to histone proteins affecting chromatin structure.

  • Chromatin Accessibility: Open chromatin regions are more accessible for transcription.

  • Applications: Cancer research, developmental biology, environmental studies.
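
DNA methylation is commonly measured with bisulfite sequencing, in which unmethylated cytosines are read as T while methylated cytosines are still read as C. The sketch below estimates the methylation level at each CpG site from such reads. It assumes the reads are already aligned, have the same length and strand as the reference, and contain no errors; the function name `cpg_methylation` and the sequences are hypothetical.

```python
def cpg_methylation(reference: str, reads: list[str]) -> dict[int, float]:
    """Fraction of reads showing a methylated C at each CpG position."""
    levels = {}
    for pos in range(len(reference) - 1):
        if reference[pos:pos + 2] != "CG":
            continue  # only score cytosines that sit in a CpG
        # After bisulfite treatment: C = methylated, T = unmethylated.
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


reference = "ACGTCGA"                       # CpG sites at positions 1 and 4
reads = ["ACGTTGA", "ACGTTGA", "ATGTTGA"]   # hypothetical bisulfite reads
print(cpg_methylation(reference, reads))
```

In this example the first CpG is methylated in two of three reads and the second in none, so the two sites would be reported with very different methylation levels.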

Proteomics

Proteomics is the large-scale study of proteins, including their structure, function, and modifications.

  • LC-MS/MS: Liquid chromatography-tandem mass spectrometry is used to identify and quantify proteins.

  • Applications: Disease biomarker discovery, drug development, functional annotation.

Metagenomics

Metagenomics involves sequencing DNA from entire microbial communities, providing insights into biodiversity and ecosystem function.

  • Applications: Environmental monitoring, human microbiome studies, discovery of novel genes.

  • Case Studies: NYC subway microbiota; Human Microbiome Project.
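
One way to turn "insights into biodiversity" into a number is a diversity index computed from how many reads are assigned to each taxon. The sketch below uses the Shannon index, a common choice that this guide does not itself name. The taxon names and read counts are invented for illustration.

```python
import math


def shannon_index(taxon_counts: dict[str, int]) -> float:
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(taxon_counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    h = 0.0
    for count in taxon_counts.values():
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


even = {"taxonA": 25, "taxonB": 25, "taxonC": 25, "taxonD": 25}
skewed = {"taxonA": 97, "taxonB": 1, "taxonC": 1, "taxonD": 1}
print(round(shannon_index(even), 3), round(shannon_index(skewed), 3))
```

The evenly distributed community scores higher than the one dominated by a single taxon, even though both contain four taxa.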

Comparative Genomics

Comparative genomics compares genome sequences across species to identify conserved elements, gene functions, and evolutionary relationships.

  • Applications: Disease gene identification, evolutionary biology, functional annotation.

  • Example: Neanderthal genome sequencing revealed that roughly 1–4% of the DNA of present-day non-African humans is inherited from Neanderthals.
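
A basic building block of such comparisons is percent identity between two aligned sequences. The sketch below assumes the sequences have already been aligned to equal length with "-" marking gaps, and it skips gapped columns, which is one of several conventions in use. The function name `percent_identity` and the sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of ungapped aligned columns where both sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


print(percent_identity("ATGCCGTA", "ATGACGTA"))  # 87.5
```

Regions that stay highly identical across distantly related species are candidates for conserved, functionally important elements.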

Summary Table: Multi-Omics Approaches

Omics Layer     | Analyte      | Key Methods              | Applications
----------------|--------------|--------------------------|---------------------------------
Genomics        | DNA          | WGS, NGS                 | Genetic variation, disease genes
Transcriptomics | RNA          | RNA-seq                  | Gene expression, diagnostics
Epigenomics     | DNA/histones | WGBS, ChIP-seq, ATAC-seq | Gene regulation, cancer
Proteomics      | Proteins     | LC-MS/MS                 | Protein function, biomarkers
Metabolomics    | Metabolites  | MS, NMR                  | Metabolic pathways, disease
