Repetitive DNA poses problems for genome sequencing. Why is this so?
Ch. 16 - Genomics: Genetics from a Whole-Genome Perspective

Sanders, 3rd Edition, Genetic Analysis: An Integrated Approach, ISBN: 9780135564172
Problem 3a
When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs. Why were there so many more contigs than scaffolds?
Verified step-by-step guidance
Understand the terms: A 'contig' is a continuous sequence of DNA assembled from overlapping reads, while a 'scaffold' is a larger structure formed by linking contigs together using additional information, such as paired-end reads or known physical distances.
Recognize that contigs are the initial building blocks of genome assembly. They are created by aligning and merging overlapping DNA sequence reads, but they do not necessarily span the entire genome due to gaps or repetitive sequences.
Scaffolds are formed by connecting contigs using information like paired-end reads, which provide spatial relationships between contigs. This allows scaffolds to span gaps that contigs cannot bridge, resulting in fewer scaffolds than contigs.
Consider the limitations of sequencing technology: Gaps between contigs often arise due to repetitive sequences, low coverage regions, or sequencing errors. These gaps prevent contigs from being merged directly, leading to a higher number of contigs compared to scaffolds.
Reflect on the assembly process: Each scaffold is an ordered, oriented series of contigs separated by sequence gaps of approximately known size. Paired-end reads can link contigs across gaps that no overlapping read covers, so many contigs are grouped into each scaffold. In the Drosophila assembly, 1636 contigs in 134 scaffolds works out to an average of about 12 contigs per scaffold.
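The grouping step described above can be sketched in a few lines of Python. This is a toy illustration, not how any particular assembler works: the contig names and the paired-end links are made up, and a simple union-find stands in for real scaffolding logic, which also tracks order, orientation, and gap size.

```python
def count_scaffolds(contigs, links):
    """Count scaffolds: groups of contigs connected by paired-end links."""
    parent = {c: c for c in contigs}

    def find(c):
        # Follow parent pointers to the group representative,
        # shortening the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in links:
        # A paired-end link places both contigs in the same scaffold.
        parent[find(a)] = find(b)

    return len({find(c) for c in contigs})


# Hypothetical data: six contigs, three paired-end links between them.
contigs = ["c1", "c2", "c3", "c4", "c5", "c6"]
links = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]

n_scaffolds = count_scaffolds(contigs, links)
print(len(contigs), "contigs ->", n_scaffolds, "scaffolds")
```

Every link merges two groups into one, so the scaffold count can only fall as links are added, while the contig count stays the same.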

Key Concepts
Here are the essential concepts you must grasp in order to answer the question correctly.
Contigs and Scaffolds
Contigs are contiguous stretches of DNA sequence assembled from overlapping reads, with no gaps. Scaffolds are larger structures consisting of multiple contigs placed in order and orientation, separated by gaps of estimated size. Because each scaffold typically contains several contigs, a genome assembly has at least as many contigs as scaffolds, and usually many more.
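As a quick check on the numbers in the problem, the arithmetic can be written out in Python. The gap count assumes that the contigs within each scaffold are arranged in a single linear order, so a scaffold of k contigs contains k - 1 internal gaps; the function names are just for illustration.

```python
def average_contigs_per_scaffold(n_contigs, n_scaffolds):
    """Mean number of contigs grouped into each scaffold."""
    return n_contigs / n_scaffolds


def gaps_within_scaffolds(n_contigs, n_scaffolds):
    """Internal gaps, assuming each scaffold is a linear chain of contigs.

    A chain of k contigs has k - 1 gaps, so summing over all scaffolds
    gives (total contigs) - (total scaffolds).
    """
    return n_contigs - n_scaffolds


# Figures from the problem statement.
avg = average_contigs_per_scaffold(1636, 134)
gaps = gaps_within_scaffolds(1636, 134)
print(f"about {avg:.1f} contigs per scaffold, {gaps} gaps inside scaffolds")
```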
Genome Assembly
Genome assembly is the process of reconstructing the complete DNA sequence of an organism's genome from short DNA fragments obtained through sequencing. This process involves aligning and merging overlapping sequences to form longer contiguous sequences (contigs) and then organizing these into scaffolds. The complexity of the genome and the quality of the sequencing data can significantly affect the number of contigs and scaffolds produced.
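The overlap-and-merge idea can be illustrated with a minimal greedy sketch in Python. It assumes error-free reads from a single strand and a made-up set of toy reads; real assemblers use overlap or de Bruijn graphs and must handle sequencing errors, both strands, and repeats.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    seqs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return seqs  # nothing left to merge: these are the contigs
        merged = seqs[best_i] + seqs[best_j][best_n:]
        seqs = [s for k, s in enumerate(seqs) if k not in (best_i, best_j)]
        seqs.append(merged)


# Toy reads from two regions; no read bridges the space between them.
reads = ["ATGCGT", "CGTTAC", "TACGGA", "GGGCCCTT", "CCTTAAG"]
contigs = greedy_assemble(reads)
print(sorted(contigs))
```

The five reads collapse into two contigs rather than one, because no read overlaps both regions. Joining those two contigs into a scaffold would need extra information, such as a paired-end link.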
Sequencing Technology Limitations
Sequencing technologies produce reads of limited length, and coverage across the genome is uneven. Where reads are shorter than a repeat, where coverage is low, or where sequencing errors obscure true overlaps, the overlaps needed to extend a contig are missing or ambiguous, so the assembly breaks into many contigs. These challenges call for computational methods and additional data, such as paired-end reads, to order the contigs into scaffolds.
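The effect of repeats can be seen in a small Python sketch. The toy genome below is invented for illustration: it contains two copies of an 8-base repeat with different flanking sequence. The function lists every read-length window that occurs at more than one position, which is where an assembler cannot tell which copy a read came from.

```python
from collections import defaultdict


def ambiguous_reads(genome, read_len):
    """Map each read-length substring seen at 2+ positions to those positions."""
    positions = defaultdict(list)
    for i in range(len(genome) - read_len + 1):
        positions[genome[i:i + read_len]].append(i)
    return {read: pos for read, pos in positions.items() if len(pos) > 1}


# Hypothetical genome: the same 8-base repeat appears twice.
repeat = "GATTACAG"
genome = "TTGCC" + repeat + "CGTCA" + repeat + "TCCGA"

short = ambiguous_reads(genome, 5)   # reads shorter than the repeat
long_ = ambiguous_reads(genome, 10)  # reads longer than the repeat
print(len(short), "ambiguous 5-base reads;", len(long_), "ambiguous 10-base reads")
```

In this toy case, reads shorter than the repeat can fall entirely inside it and match both copies, while reads longer than the repeat always include some unique flanking sequence.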
Related Practice
Repetitive DNA poses problems for genome sequencing. What types of repetitive DNA are most problematic?
Repetitive DNA poses problems for genome sequencing. What strategies can be employed to overcome these problems?
When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs. What is the difference between physical and sequence gaps?
When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs. How can physical gaps be closed?
When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs. How can sequence gaps be closed?