Ch. 16 - Genomics: Genetics from a Whole-Genome Perspective
Sanders - Genetic Analysis: An Integrated Approach 3rd Edition
ISBN: 9780135564172
Chapter 16, Problem 3b

When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs. What is the difference between physical and sequence gaps?

Verified step by step guidance
Understand the concept of a 'scaffold' in genome assembly: A scaffold is a set of contigs (contiguous stretches of assembled DNA sequence) that have been ordered and oriented relative to one another using linking information, such as paired-end reads from the two ends of a cloned insert.
Define a 'physical gap': A physical gap is a gap between scaffolds. No clone in the sequencing libraries spans the region, so no paired-end reads link the flanking sequences, and neither the size of the gap nor the relative order of the flanking scaffolds can be inferred from the shotgun data alone.
Define a 'sequence gap': A sequence gap is a gap between adjacent contigs within a scaffold. The region is known to exist because at least one clone spans it (its paired-end reads fall in the two flanking contigs), but the sequence of the intervening DNA has not been determined, often because it is repetitive or poorly covered. Such gaps are commonly represented by runs of 'N' in the assembled scaffold sequence.
Compare physical and sequence gaps: In a sequence gap, a spanning clone exists, so the order, orientation, and approximate spacing of the flanking contigs are known, and the gap can usually be closed by sequencing the rest of that clone. In a physical gap, no spanning clone exists, so closing it requires additional experimental work, such as building new libraries or using independent mapping data to position the scaffolds.
Relate the concepts to the Drosophila genome assembly: The 1636 contigs were linked into 134 scaffolds, so the gaps between contigs within scaffolds are sequence gaps, and there are 1636 - 134 = 1502 of them. The gaps that separate the 134 scaffolds from one another are physical gaps.
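The counting argument in the steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every scaffold is a linear chain of contigs, and the scaffolds lie end to end on `n_molecules` linear chromosomes. The function name and the `n_molecules` parameter are hypothetical and do not come from the textbook.

```python
def count_gaps(n_contigs: int, n_scaffolds: int, n_molecules: int = 1) -> tuple[int, int]:
    """Return (sequence_gaps, physical_gaps) under the stated assumptions."""
    if not 1 <= n_molecules <= n_scaffolds <= n_contigs:
        raise ValueError("expected 1 <= n_molecules <= n_scaffolds <= n_contigs")
    # A scaffold holding k contigs contains k - 1 gaps between adjacent contigs,
    # so summing over all scaffolds gives contigs minus scaffolds.
    sequence_gaps = n_contigs - n_scaffolds
    # Assumption: scaffolds on the same molecule are separated by one physical gap each.
    physical_gaps = n_scaffolds - n_molecules
    return sequence_gaps, physical_gaps


seq_gaps, phys_gaps = count_gaps(n_contigs=1636, n_scaffolds=134)
print(seq_gaps)  # 1502
```

The physical-gap count depends on how many chromosomes (or chromosome arms) the scaffolds are spread across, which the problem does not state, so that value should be read as a rough bound rather than a reported figure.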

Key Concepts

Here are the essential concepts you must grasp in order to answer the question correctly.

Physical Gaps

Physical gaps are regions between scaffolds for which the sequencing libraries contain no spanning clone. Because no paired-end reads bridge the region, the assembly provides no sequence for it and no direct information about its size or about which scaffolds flank it. Counting physical gaps is one way to assess how complete a genome assembly is.

Sequence Gaps

Sequence gaps are regions between adjacent contigs within a scaffold. A clone spanning the region exists, so the order and orientation of the flanking contigs are known, but the intervening sequence has not been determined and is often indicated by 'N' in the sequence data. These gaps commonly arise from repetitive regions that are difficult to sequence or assemble. Identifying sequence gaps is important for evaluating the quality and accuracy of the assembled genome.
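As a small illustration of how 'N' runs mark sequence gaps, the sketch below locates them in a scaffold sequence using Python's standard `re` module. The function name `find_n_runs`, the `min_len` parameter, and the toy sequence are assumptions made for this example, not part of any assembly tool.

```python
import re


def find_n_runs(scaffold_seq: str, min_len: int = 1) -> list[tuple[int, int]]:
    """Return 0-based, end-exclusive (start, end) spans of 'N' runs of at least min_len bases."""
    return [
        m.span()
        for m in re.finditer(r"[Nn]+", scaffold_seq)
        if m.end() - m.start() >= min_len
    ]


toy_scaffold = "ACGTNNNNGGCCNNTTAA"  # made-up sequence with two gaps
print(find_n_runs(toy_scaffold))  # [(4, 8), (12, 14)]
```

Each reported span corresponds to one sequence gap, and the stretches between spans are the contigs.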

Contigs and Scaffolds

Contigs are contiguous sequences of DNA that have been assembled from overlapping reads, each representing an uninterrupted stretch of the genome. Scaffolds are larger structures that consist of multiple contigs placed in a known order and orientation by paired-end reads, with sequence gaps between them. The gaps within a scaffold are therefore sequence gaps, and the gaps between scaffolds are physical gaps.
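One way to picture the relationship is as a simple data structure: a scaffold holds an ordered list of contigs plus an estimated size for each gap between neighbors. The sketch below is a minimal model, assuming gap sizes have already been estimated (for example from paired-end insert sizes). The class name `Scaffold` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scaffold:
    contigs: list[str]    # contig sequences in scaffold order
    gap_sizes: list[int]  # assumed estimated size of each gap between adjacent contigs

    def __post_init__(self) -> None:
        # k contigs are separated by exactly k - 1 sequence gaps.
        if len(self.gap_sizes) != len(self.contigs) - 1:
            raise ValueError("need exactly one gap size between each pair of adjacent contigs")

    def sequence(self) -> str:
        """Join the contigs, padding each sequence gap with a run of 'N'."""
        parts = [self.contigs[0]]
        for gap, contig in zip(self.gap_sizes, self.contigs[1:]):
            parts.append("N" * gap)
            parts.append(contig)
        return "".join(parts)


toy = Scaffold(contigs=["ACGT", "GGCC", "TTAA"], gap_sizes=[4, 2])
print(toy.sequence())  # ACGTNNNNGGCCNNTTAA
```

Physical gaps do not appear in this model at all, which reflects the distinction: they lie between scaffolds, where there is no linking information to record.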