Recombinant DNA Technology, DNA Sequencing, and Genomic Analysis

Study Guide - Smart Notes

Recombinant DNA Technology and Genetically Modified Organisms (GMOs)

Introduction to Recombinant DNA and GMOs

Recombinant DNA technology enables the manipulation and combination of DNA from different sources, leading to the creation of genetically modified organisms (GMOs). This technology is foundational in modern genetics, biotechnology, and medicine.

  • Recombinant DNA: DNA molecules formed by laboratory methods of genetic recombination to bring together genetic material from multiple sources.

  • Genetically Modified Organisms (GMOs): Organisms whose genetic material has been altered using recombinant DNA technology.

  • Applications: Agriculture (GMO crops), medicine (insulin production), research (gene function studies).

How are GMO plants made?

Polymerase Chain Reaction (PCR) and DNA Amplification

Principles and Applications of PCR

The polymerase chain reaction (PCR) is a technique used to amplify specific DNA segments, making millions of copies from a small initial sample. PCR is essential for cloning, sequencing, and genetic analysis.

  • Key Steps: Denaturation, annealing, and extension.

  • Applications: Genetic testing, forensics, cloning, and sequencing.
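The "millions of copies" figure follows from repeated doubling. The sketch below is a minimal illustration of that arithmetic; the function name and the `efficiency` parameter are assumptions introduced here, not part of any standard library.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a given number of PCR cycles.

    efficiency is the fraction of templates duplicated in each cycle
    (1.0 means perfect doubling). It is an illustrative assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    # Each cycle multiplies the template count by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "cycles:", pcr_copies(1, n))
```

With perfect doubling, 20 cycles turn a single template into 2^20 = 1,048,576 copies; real reactions fall short of this because efficiency drops in later cycles.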

DNA Sequencing Technologies

Overview of DNA Sequencing

DNA sequencing determines the precise order of nucleotides within a DNA molecule. Sequencing technologies have evolved through three generations, each with distinct features and applications.

  • Goal: Identify the complete sequence of nucleotide bases (A, T, C, G) in a DNA sample.

  • Challenge: No current instrument can read an entire genome as one continuous molecule; DNA must be fragmented, sequenced in parts, and reassembled computationally.
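The fragmentation step can be pictured as sampling short substrings from a long sequence. This is a toy sketch only: the function name, the fixed read length, and the uniform random start positions are simplifying assumptions, and real fragmentation is neither uniform nor error-free.

```python
import random


def shotgun_reads(genome: str, read_length: int, n_reads: int, seed: int = 0) -> list:
    """Sample fixed-length reads from random positions in a genome string."""
    if read_length > len(genome):
        raise ValueError("read_length cannot exceed genome length")
    rng = random.Random(seed)  # seeded so the example is reproducible
    max_start = len(genome) - read_length
    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, max_start)  # inclusive on both ends
        reads.append(genome[start:start + read_length])
    return reads


if __name__ == "__main__":
    toy_genome = "ATGCGTACCGGATTACAGGCT"
    for read in shotgun_reads(toy_genome, read_length=6, n_reads=5):
        print(read)
```

Because start positions are random, many reads overlap one another, and those overlaps are what later make reassembly possible.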

First Generation: Sanger Sequencing

Sanger sequencing, also known as chain-termination sequencing, was the first widely used method for DNA sequencing. It uses dideoxynucleotides (ddNTPs) to terminate DNA synthesis at specific bases, allowing the sequence to be determined.

  • Key Principle: Incorporation of ddNTPs, which lack a 3'-OH group, terminates DNA strand elongation.

  • Detection: Fragments are separated by size using gel electrophoresis, and the sequence is read from the resulting pattern.

  • Limitations: Low throughput, suitable for single genes or small DNA fragments.

Figures: Original Sanger sequencing gel and readout; structural difference between dNTP and ddNTP.

Sanger Sequencing Workflow

  • DNA is amplified and denatured.

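  • A primer, DNA polymerase, dNTPs, and fluorescently labeled ddNTPs are added.

  • A ddNTP is occasionally incorporated in place of the matching dNTP, so across the pool synthesis terminates at every position, producing fragments of every length.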

  • Fragments are separated by capillary gel electrophoresis.

  • Laser excitation and detection produce a chromatogram for sequence determination.
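The logic of reading a sequence from terminated fragments can be sketched in a few lines. This is an idealized model, assuming exactly one fragment terminates at each position and that each fragment reports the identity of its final (labeled) base; the function names are hypothetical.

```python
import random


def sanger_fragments(new_strand: str) -> list:
    """Return (length, terminal_base) for a fragment ending at every position.

    Idealized assumption: the pool contains one terminated fragment per
    position of the newly synthesized strand.
    """
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_sequence(fragments: list) -> str:
    """Sort fragments by length (the electrophoresis step) and read the labels."""
    return "".join(base for _, base in sorted(fragments))


if __name__ == "__main__":
    pool = sanger_fragments("GATTACA")
    random.shuffle(pool)        # fragments start out mixed together
    print(read_sequence(pool))  # size separation restores the order
```

Sorting by length plays the role of capillary electrophoresis, and the terminal base label plays the role of the fluorescent dye detected by the laser.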

Figures: PCR with fluorescent, chain-terminating ddNTPs; size separation by capillary gel electrophoresis; laser excitation and detection by sequencing machine.

Second Generation: Next Generation Sequencing (NGS)

Next Generation Sequencing (NGS) technologies, such as Illumina sequencing, allow for massively parallel sequencing of millions of short DNA fragments, greatly increasing throughput and reducing cost per base.

  • Key Features: Sequencing by synthesis, short reads, high throughput, automation.

  • Applications: Whole genome sequencing, transcriptomics, metagenomics.
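One way to make "high throughput" concrete is the commonly used mean-coverage estimate: total sequenced bases divided by genome size. The sketch below assumes uniform read length, and the example numbers are purely illustrative rather than taken from any specific instrument or run.

```python
def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced (depth of coverage)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


if __name__ == "__main__":
    # Illustrative values: 1 million reads of 150 bp over a 5 Mb genome.
    print(mean_coverage(1_000_000, 150, 5_000_000))
```

Short reads are individually uninformative, so sequencing each position many times over is what allows the reads to be assembled and errors to be averaged out.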

Figures: Massively parallel DNA sequencing; Illumina sequencing steps.

Third Generation: Single-Molecule Sequencing

Third generation sequencing technologies, such as PacBio and Oxford Nanopore, sequence single DNA molecules in real time, producing long reads and enabling the analysis of complex genomic regions.

  • Key Features: Long reads, real-time sequencing, high throughput, direct detection of base modifications.

  • Applications: De novo genome assembly, structural variant detection, epigenetic analysis.

Figures: Single molecule DNA sequencing (Nanopore); PacBio SMRT sequencing workflow.

Comparison of Sequencing Technologies

| Generation | Technology | Read Length | Throughput | Key Features |
|---|---|---|---|---|
| 1st | Sanger | 500-1,000 bp | Low | Accurate, single gene |
| 2nd | Illumina (NGS) | 50-500 bp | High | Massively parallel, short reads |
| 3rd | PacBio, Nanopore | 10,000+ bp | Very High | Long reads, real-time |

Genomic Analysis and Applications

What is Genomics?

Genomics is the study of an organism’s complete set of DNA, including its structure, function, evolution, and mapping. It encompasses the analysis of the entire genome, including coding and noncoding regions.

  • Structure/Mapping: Determining the DNA sequence and gene locations.

  • Function: Assigning biological functions to genomic elements and understanding gene regulation.

Genome Sequencing and Assembly

Whole genome sequencing (WGS) involves sequencing the entire genome and assembling the sequence reads into a complete genome. This process requires the construction of DNA libraries and the use of bioinformatics tools for assembly.

  • Genomic DNA Libraries: Collections of DNA fragments representing the entire genome.

  • cDNA Libraries: Collections of cDNA copies of mRNA, representing expressed genes.

  • Assembly: Overlapping sequence reads are matched to form contigs and scaffolds, which are then mapped to chromosomes.
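The idea of matching overlapping reads into contigs can be illustrated with a toy greedy merge. This is a simplified sketch, not how production assemblers work: it assumes error-free reads on a single strand, ignores reads fully contained in other reads, and the function names and `min_len` threshold are assumptions made for the example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0  # no overlap of at least min_len


def greedy_assemble(reads: list, min_len: int = 3) -> list:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCGGA"]))
```

Reads that share no sufficient overlap remain as separate contigs, which is why additional information (such as maps or long reads) is needed to order contigs into scaffolds.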

Figures: Genomic DNA library construction; cDNA library synthesis.

The Human Genome Project (HGP)

The Human Genome Project was an international effort to sequence the human genome and map its genes. It provided a reference genome and revealed the genetic similarity and diversity among humans and other species.

  • Timeline: Draft announced in 2000, declared complete in 2003 (with gaps remaining in repetitive regions); first gapless assembly published in 2022 by the Telomere-to-Telomere (T2T) Consortium.

  • Methods: DNA fragments cloned into BACs/YACs, amplified, and sequenced using Sanger sequencing.

  • Findings: ~20,000 protein-coding genes, ~99.9% similarity among humans, identification of SNPs and CNVs.

Figure: Human Genome Project reference sequence composition.

Genome Assembly: Contigs, Scaffolds, and Mapping

Genome assembly is the process of reconstructing the original genome from sequence reads. Contigs are continuous sequences formed by overlapping reads, and scaffolds are ordered sets of contigs. Genetic and physical maps are used to organize scaffolds into chromosomes.

  • Genetic Maps: Based on recombination frequency, measured in centimorgans (cM).

  • Physical Maps: Based on actual base-pair distances, measured in bp, kb, or Mb.
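The two kinds of map distance use different arithmetic, which a short sketch can make explicit. It assumes the usual convention that 1% recombinant offspring corresponds to 1 cM, which is a reasonable approximation only for closely linked loci; the function names are hypothetical.

```python
def map_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Genetic distance in centimorgans from a recombination frequency.

    Uses 1% recombination = 1 cM, which underestimates distance for
    loci that are far apart (double crossovers go uncounted).
    """
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring


def format_physical_distance(bp: int) -> str:
    """Express a base-pair distance in bp, kb, or Mb."""
    if bp >= 1_000_000:
        return f"{bp / 1_000_000:g} Mb"
    if bp >= 1_000:
        return f"{bp / 1_000:g} kb"
    return f"{bp} bp"


if __name__ == "__main__":
    print(map_distance_cm(15, 100), "cM")
    print(format_physical_distance(1_500_000))
```

The two scales do not convert at a fixed rate, because recombination frequency per base pair varies along a chromosome.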

Multi-Omics Technologies

Overview of Omics

Multi-omics technologies integrate data from genomics, transcriptomics, epigenomics, proteomics, and metabolomics to provide a comprehensive view of biological systems.

  • Genomics: Analysis of DNA sequence and structure.

  • Transcriptomics: Analysis of RNA expression (mRNA, ncRNA).

  • Epigenomics: Analysis of DNA methylation, histone modification, and chromatin accessibility.

  • Proteomics: Analysis of protein expression and modification.

  • Metabolomics: Analysis of metabolites and small molecules.

Comparative Genomics

Comparative genomics compares genome sequences across species to identify conserved and unique genetic elements, study evolutionary relationships, and understand the genetic basis of diseases.

  • Applications: Disease gene identification, evolutionary studies, species adaptation analysis.
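A basic measure of conservation is percent identity between two aligned sequences. The sketch below assumes the sequences have already been aligned to equal length with "-" marking gaps, and it skips gapped columns; both choices are simplifying assumptions, and other definitions of identity exist.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of ungapped aligned positions where the two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    # Ignore columns where either sequence has a gap character.
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Regions with unusually high identity across distant species are candidates for conserved functional elements.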

Metagenomics

Metagenomics is the study of genetic material recovered directly from environmental samples, allowing the analysis of entire microbial communities without culturing.

  • Applications: Discovery of new species, understanding microbial diversity, health and disease research.

Functional Genomics

Functional genomics aims to understand the roles and interactions of genes and other genomic elements, often using high-throughput techniques such as RNA sequencing and microarrays.

  • Transcriptome: All RNA molecules transcribed from the genome.

  • Epigenome: All chemical modifications to DNA and histones.

  • Proteome: All proteins encoded by the genome.

Transcriptomics

Transcriptomics studies gene expression at the RNA level, both qualitatively and quantitatively. RNA sequencing (RNA-seq) is a key method for transcriptome analysis.

  • Applications: Cataloging RNA content, comparing gene expression across tissues, cancer diagnostics.

  • Methods: Bulk RNA-seq for average expression, single-cell RNA-seq for cell-specific analysis.
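The difference between bulk and single-cell measurements can be shown with a toy count table. The gene names, cell names, and counts below are invented for illustration, and the averaging function is a hypothetical stand-in for what a bulk experiment effectively measures.

```python
def bulk_profile(cells: dict) -> dict:
    """Average expression per gene across all cells (what bulk RNA-seq sees)."""
    if not cells:
        return {}
    genes = sorted({gene for counts in cells.values() for gene in counts})
    n_cells = len(cells)
    return {
        gene: sum(counts.get(gene, 0) for counts in cells.values()) / n_cells
        for gene in genes
    }


if __name__ == "__main__":
    # Hypothetical single-cell counts: two cell types with opposite patterns.
    single_cell = {
        "cell_1": {"GENE_A": 10, "GENE_B": 0},
        "cell_2": {"GENE_A": 0, "GENE_B": 10},
    }
    print(bulk_profile(single_cell))
```

In this toy case the bulk average reports equal expression of both genes, hiding the fact that each cell expresses only one of them.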

Epigenomics

Epigenomics analyzes genome-wide epigenetic modifications, such as DNA methylation and histone modification, which regulate gene expression without altering the DNA sequence.

  • Methods: Whole genome bisulfite sequencing (WGBS), ATAC-seq, chromatin conformation capture.

  • Applications: Cancer research, developmental biology, environmental and aging studies.
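Bisulfite sequencing rests on a simple rule: bisulfite treatment converts unmethylated cytosines to uracil (read as T after amplification), while methylated cytosines are protected and still read as C. The sketch below models that rule on a single strand; the function names are hypothetical, and it ignores incomplete conversion and sequencing errors.

```python
def bisulfite_convert(sequence: str, methylated_positions: set) -> str:
    """Simulate bisulfite treatment: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(sequence)
    )


def call_methylation(reference: str, converted: str) -> set:
    """Positions where the reference has C and the converted read still has C."""
    return {
        i for i, (ref, obs) in enumerate(zip(reference, converted))
        if ref == "C" and obs == "C"
    }


if __name__ == "__main__":
    reference = "ACGTCCGA"
    read = bisulfite_convert(reference, methylated_positions={1})
    print(read)
    print(call_methylation(reference, read))
```

Comparing the converted read with the untreated reference is what turns a chemical modification into a readable sequence difference.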

Proteomics

Proteomics is the large-scale study of proteins, including their expression, structure, and function. Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is a primary tool for proteomic analysis.

  • Steps: Sample preparation, protein digestion, separation, mass spectrometry, data analysis, quantification.

  • Applications: Comparing protein expression in healthy vs. diseased cells, biomarker discovery.
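The protein digestion step can be sketched in silico. The notes do not name an enzyme; this example assumes trypsin, a protease widely used in proteomics, and applies the common rule that it cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name is hypothetical, and missed cleavages are ignored.

```python
def tryptic_digest(protein: str) -> list:
    """Split a protein sequence into peptides using the simplified trypsin rule."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        # Cleave on the C-terminal side of K or R, but not before P.
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # trailing peptide without K/R
    return peptides


if __name__ == "__main__":
    print(tryptic_digest("AAKGGRPLLKTT"))
```

Digesting proteins into predictable peptides is what allows measured peptide masses to be matched back to protein sequences during data analysis.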

Summary Table: Multi-Omics Technologies

| Omics Field | Analyte | Key Methods | Applications |
|---|---|---|---|
| Genomics | DNA | Sequencing, WGS | Gene discovery, disease genetics |
| Transcriptomics | RNA | RNA-seq, microarrays | Gene expression, diagnostics |
| Epigenomics | Epigenetic marks | WGBS, ATAC-seq | Gene regulation, cancer research |
| Proteomics | Proteins | LC-MS/MS | Protein function, biomarker discovery |
| Metabolomics | Metabolites | Mass spectrometry | Metabolic profiling, disease research |
