NGS FAQ

Recommended sequencing depth

Sequencing depth is one of the most important experimental-design decisions in NGS. Too little depth can miss variants, transcripts or peaks; too much depth can waste budget if the library is already saturated. The right target depends on assay type, organism, sample quality, biological question and downstream analysis.

Quick answer

Recommended sequencing depth depends on the assay and endpoint. Human short-read WGS often starts around 30× to 50× mean coverage, WES often targets around 100× mean coverage, while RNA-seq is usually planned in millions of reads per sample rather than genome coverage.

Bulk RNA-seq may range from a few million reads for simple expression profiling to more than 100 million reads for deep transcriptome discovery. Single-cell RNA-seq is typically planned in reads per cell. Targeted panels, rare-variant assays, cell-free DNA and UMI-based workflows require separate planning because raw depth and usable molecular depth can differ substantially.

Practical rule: choose sequencing depth from the biological question first, then adjust it for genome size, library complexity, expected signal, sample quality, replicate number, read length, target region size and required sensitivity.

Coverage and read depth are related but not identical

For DNA-seq, coverage is the average number of reads covering each base of the reference or target region. For RNA-seq, “coverage” is less straightforward because transcripts differ greatly in abundance, length and isoform complexity; RNA-seq depth is therefore usually described as total reads or read pairs per sample.

Basic coverage formula
C = L × N / G

C is average coverage, L is read length in bases, N is the number of reads (for paired-end data, count both mates of each pair), and G is the genome or target size in bases. In practice, mapped coverage, usable reads, duplication, target enrichment and coverage uniformity must also be evaluated.
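As a worked example of the formula above, the following minimal sketch computes mean coverage and inverts the formula to estimate the number of reads for a coverage target. The genome size (about 3.1 Gb) and the 150 bp read length are illustrative assumptions, and the function names are hypothetical.

```python
import math

def mean_coverage(read_length: int, n_reads: int, target_size: int) -> float:
    """C = L * N / G, where N counts individual reads (both mates of a pair)."""
    return read_length * n_reads / target_size

def reads_needed(target_coverage: float, read_length: int, target_size: int) -> int:
    """Invert the formula: N = C * G / L, rounded up to a whole read."""
    return math.ceil(target_coverage * target_size / read_length)

# Illustrative assumptions, not values from a specific instrument or protocol.
GENOME_SIZE = 3_100_000_000   # approximate human genome size in bases
READ_LENGTH = 150             # bases per read

n_reads = reads_needed(30, READ_LENGTH, GENOME_SIZE)
print(f"reads for 30x: {n_reads:,}")            # individual reads
print(f"read pairs for 30x: {n_reads // 2:,}")  # paired-end fragments
print(f"check: {mean_coverage(READ_LENGTH, n_reads, GENOME_SIZE):.1f}x")
```

Under these assumptions, 30× corresponds to about 620 million reads, or 310 million read pairs. This is raw coverage only; mapping rate, duplication and uniformity reduce the usable figure.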

Raw depth: Total sequencing output before alignment, filtering and duplicate handling. Useful for run planning but not enough for interpretation.
Mapped depth: Reads that align to the expected genome, transcriptome or target. This is more informative than raw output.
Unique usable depth: Reads or fragments that remain after filters such as mapping quality, duplicates, blacklists or off-target removal.
Molecular depth: Number of original molecules represented after UMI collapsing or consensus generation. Critical for UMI panels and rare-variant assays.
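The gap between raw and usable depth can be approximated with a simple multiplicative model. The sketch below is a planning simplification under stated assumptions: it treats mapping, on-target and duplicate losses as independent fractions, and all rates in the example are hypothetical rather than taken from a real run.

```python
def usable_coverage(raw_coverage: float,
                    mapped_fraction: float,
                    duplicate_fraction: float,
                    on_target_fraction: float = 1.0) -> float:
    """Rough usable coverage after mapping, on-target and duplicate losses.

    Assumes the three losses are independent, which is only an approximation.
    """
    for name, value in (("mapped_fraction", mapped_fraction),
                        ("duplicate_fraction", duplicate_fraction),
                        ("on_target_fraction", on_target_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return raw_coverage * mapped_fraction * on_target_fraction * (1.0 - duplicate_fraction)

# Hypothetical capture library: 100x raw, 98% mapped, 70% on target, 15% duplicates.
print(f"{usable_coverage(100, 0.98, 0.15, 0.70):.1f}x usable")
```

In this hypothetical case, a nominal 100× library yields less than 60× usable coverage, which is why raw depth alone is not enough for interpretation.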

Recommended sequencing-depth starting points

Assay type | Typical starting depth | Comments
Human WGS, germline short-read | 30×–50× mean coverage | Common range for SNVs and small indels. CNVs, SVs, mosaicism, difficult regions or clinical requirements may change the target.
Human WES | ~100× mean target coverage | Mean target coverage is not enough by itself. Coverage uniformity and percentage of target bases above thresholds such as 20× or 30× are essential.
Targeted DNA panels | 250×–1000×+ raw depth | Depends on target size, variant allele frequency, tumour purity, UMI use and validation requirements. UMI panels should report molecular depth.
Rare variant / cfDNA / MRD-style assays | Often 1000×+ raw, molecule-dependent | Raw read depth alone is insufficient. UMI consensus, error suppression, input mass and molecular coverage are decisive.
Bulk mRNA-seq, simple expression profiling | 5–25 million reads/sample | Suitable for highly expressed genes or quick expression snapshots when subtle splicing or low-expression genes are not the primary endpoint.
Bulk whole-transcriptome RNA-seq | 30–60 million reads/sample | Common starting range for many mammalian differential-expression projects and some alternative-splicing information.
Deep transcriptome / transcript discovery | 100–200 million reads/sample | Useful for transcript assembly, low-expression transcripts, complex splicing or discovery-oriented projects.
Small RNA-seq / miRNA-seq | 1–5 million reads/sample | Often sufficient for many applications, but tissue type, sample complexity and desired small-RNA classes matter.
Single-cell RNA-seq, 3′/5′ gene expression | ~20,000+ read pairs/cell | Starting point for many 10x-style gene-expression projects. Higher depth can help with low-abundance transcripts or richer transcriptomes.
ChIP-seq | Often 10–50+ million usable reads | Sharp transcription-factor peaks may need less than broad histone marks. FRiP, library complexity and replicate concordance are critical.
ATAC-seq | Often 25–50 million paired-end reads/sample | Depth must be interpreted with insert-size profile, TSS enrichment, mitochondrial fraction, FRiP and duplicate rate.
WGBS / bisulfite-seq | Often 20×–30×+ genome coverage | More depth may be required for allele-specific methylation, low-input samples, non-human genomes or high-confidence CpG coverage.
Metagenomics | Highly variable | Depends on host fraction, microbial diversity, target abundance, strain-level goals, assembly requirements and functional profiling.

These values are planning ranges, not universal rules. A pilot dataset or saturation analysis is often the best way to refine depth for a new protocol, organism or sample type.

Sequencing depth for DNA-seq

DNA-seq depth is usually driven by the variant class and sensitivity requirement. Germline SNVs and small indels generally require less depth than low-frequency somatic variants, mosaic variants, minimal residual disease, or complex structural alterations.

Whole-genome sequencing: Mean coverage is useful, but also check coverage uniformity, GC bias, insert size, duplication rate, contamination and percent of genome covered above analysis thresholds.
Exome and panels: Target coverage is uneven because capture efficiency differs across regions. Report percentage of target bases covered at clinically or scientifically relevant thresholds.
Somatic variant calling: Required depth depends on expected variant allele fraction, tumour purity, normal contamination, subclonality and error model.
UMI-based sequencing: Do not rely only on raw read depth. Count unique molecules, family size and consensus-level support.
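The link between variant allele fraction and required depth can be illustrated with a binomial model. The sketch below is a simplified assumption rather than a validated caller model: it treats reads at a site as independent draws, ignores sequencing error and mapping bias, and the "at least k supporting reads" rule and function names are hypothetical.

```python
from math import comb

def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(X >= min_alt_reads) for X ~ Binomial(depth, vaf)."""
    p_below = sum(comb(depth, k) * vaf**k * (1.0 - vaf)**(depth - k)
                  for k in range(min_alt_reads))
    return 1.0 - p_below

def min_depth_for_sensitivity(vaf: float, min_alt_reads: int,
                              sensitivity: float, max_depth: int = 100_000) -> int:
    """Smallest depth at which the detection probability reaches the target."""
    for depth in range(min_alt_reads, max_depth + 1):
        if detection_probability(depth, vaf, min_alt_reads) >= sensitivity:
            return depth
    raise ValueError("target sensitivity not reached within max_depth")

# Illustrative comparison: heterozygous germline site vs. low-fraction somatic variant,
# both requiring at least 5 supporting reads with 95% probability.
for vaf in (0.5, 0.05):
    depth = min_depth_for_sensitivity(vaf, min_alt_reads=5, sensitivity=0.95)
    print(f"VAF {vaf}: usable depth of about {depth}x")
```

The depth returned here is usable depth at the site, so raw depth targets need to be higher. For UMI assays the same reasoning applies to consensus molecules rather than raw reads.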

Sequencing depth for RNA-seq

RNA-seq depth depends on whether the goal is simple gene expression, differential expression, isoform-level quantification, fusion detection, alternative splicing, transcript discovery or low-expression gene detection.

RNA-seq goal | Practical depth range | Notes
Highly expressed genes | 5–25M reads/sample | May be sufficient for broad expression profiles and large expression effects.
Standard mammalian mRNA-seq | 30–60M reads/sample | Common starting point for differential expression and moderate transcriptome complexity.
Alternative splicing / isoforms | 60–100M+ reads/sample | Paired-end stranded reads are often helpful. Transcript annotation and read length matter.
Transcript discovery or assembly | 100–200M reads/sample | Higher depth and paired-end data improve discovery, but long-read RNA sequencing may also be useful.
Small RNA / miRNA | 1–5M reads/sample | Adapter trimming and read-length distribution are especially important.

Sequencing depth for single-cell RNA-seq

Single-cell RNA-seq is usually planned in read pairs per cell. Lower depth can be sufficient for cell-type annotation when cell types are distinct and abundant. Higher depth is useful when the goal is to detect subtle states, low-expression genes, rare transcripts, perturbation effects or richer transcriptomes.

Cell-type atlas: Moderate reads per cell may be enough when the goal is broad clustering and annotation.
Functional interpretation: Higher reads per cell may improve detection of low-abundance genes, pathways and cell-state markers.
Multiplexing: Depth planning must include expected cell recovery, doublet rate, cell hashing or sample multiplexing and desired cells per condition.
Saturation: Library saturation and median genes per cell help determine whether more sequencing is likely to improve results.
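Reads-per-cell planning and saturation can be sketched with two small calculations. The saturation formula below, one minus the ratio of deduplicated to total reads, is a commonly used definition and is assumed here; the cell number, depth and read counts are hypothetical.

```python
def total_read_pairs(n_cells: int, read_pairs_per_cell: int) -> int:
    """Read pairs to request for a library with the expected recovered cells."""
    return n_cells * read_pairs_per_cell

def sequencing_saturation(total_reads: int, unique_reads: int) -> float:
    """Fraction of reads that re-observe an already seen molecule.

    Assumed definition: 1 - unique_reads / total_reads, where unique_reads is
    the number of distinct (cell barcode, UMI, gene) combinations.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 1.0 - unique_reads / total_reads

# Hypothetical library: 10,000 recovered cells at 20,000 read pairs per cell.
print(f"{total_read_pairs(10_000, 20_000):,} read pairs")
# Hypothetical result: 200M reads of which 80M are distinct molecules.
print(f"saturation: {sequencing_saturation(200_000_000, 80_000_000):.0%}")
```

A high saturation value suggests that additional sequencing of the same library would mostly re-sample known molecules, whereas a low value suggests more reads could still add information.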

Sequencing depth for epigenomics

ChIP-seq, ATAC-seq, CUT&RUN, CUT&Tag and bisulfite sequencing require interpretation beyond raw read number. Signal-to-noise, enrichment, duplicate rate, usable fragments, mitochondrial fraction, target-region complexity and replicate concordance are central.

ChIP-seq: Sharp transcription-factor peaks often need fewer usable reads than broad histone marks. Broad marks and weak antibodies may require deeper sequencing and careful controls.
ATAC-seq: Paired-end sequencing is often preferred because fragment-size information is biologically informative. TSS enrichment and FRiP should be evaluated alongside depth.
Bisulfite-seq: Depth should be planned at cytosine or CpG level. Conversion efficiency, mapping rate and duplication rate strongly affect usable methylation calls.
CUT&RUN / CUT&Tag: These assays may require fewer reads than conventional ChIP-seq when enrichment is strong, but depth depends on target abundance and background.
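FRiP (fraction of reads in peaks) is the share of usable reads that fall inside called peaks. The toy sketch below illustrates the calculation on a single chromosome, assuming sorted, non-overlapping half-open peak intervals and one representative position per read; production pipelines compute this from alignment and peak files, and the function names here are hypothetical.

```python
from bisect import bisect_right

def count_reads_in_peaks(read_positions, peaks):
    """Count positions inside sorted, non-overlapping [start, end) intervals."""
    starts = [start for start, _ in peaks]
    hits = 0
    for pos in read_positions:
        i = bisect_right(starts, pos) - 1   # last peak starting at or before pos
        if i >= 0 and pos < peaks[i][1]:
            hits += 1
    return hits

def frip(read_positions, peaks) -> float:
    """Fraction of reads in peaks for one chromosome of usable reads."""
    if not read_positions:
        raise ValueError("no reads supplied")
    return count_reads_in_peaks(read_positions, peaks) / len(read_positions)

# Toy data: two peaks and six read positions, three of which fall in a peak.
peaks = [(100, 200), (500, 600)]
reads = [50, 150, 199, 200, 550, 700]
print(f"FRiP = {frip(reads, peaks):.2f}")
```

A low FRiP at high depth points to weak enrichment rather than insufficient sequencing, so adding reads is unlikely to rescue the experiment.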

Sequencing depth for metagenomics

Metagenomic sequencing depth varies more than most NGS assays. The required depth depends on whether the aim is taxonomic profiling, detection of rare organisms, strain-level analysis, assembly, antimicrobial resistance profiling, viral detection or functional annotation.

  • 16S/amplicon profiling: fewer reads may be sufficient for broad taxonomic composition, but primer bias and database choice matter.
  • Shotgun metagenomics: depth depends strongly on host read fraction and microbial complexity.
  • Assembly: more depth is usually required, especially for complex communities and low-abundance organisms.
  • Rare pathogen detection: required depth can be very high if the target organism is rare or host background is large.
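The effect of host background and low abundance on shotgun depth can be sketched as follows. This is a rough planning estimate under stated assumptions: reads are split proportionally between host and microbes, the organism's abundance is expressed as a fraction of non-host reads, and all input values are hypothetical.

```python
import math

def total_reads_for_target(target_coverage: float, genome_size: int, read_length: int,
                           host_fraction: float, relative_abundance: float) -> int:
    """Total reads needed so one organism reaches the target mean coverage.

    host_fraction: fraction of all reads that come from the host.
    relative_abundance: fraction of non-host reads from the organism of interest.
    """
    if not 0.0 <= host_fraction < 1.0:
        raise ValueError("host_fraction must be in [0, 1)")
    if not 0.0 < relative_abundance <= 1.0:
        raise ValueError("relative_abundance must be in (0, 1]")
    organism_reads = target_coverage * genome_size / read_length
    return math.ceil(organism_reads / ((1.0 - host_fraction) * relative_abundance))

# Hypothetical sample: 5 Mb genome at 10x, 150 bp reads, 90% host, 1% abundance.
n = total_reads_for_target(10, 5_000_000, 150, host_fraction=0.90, relative_abundance=0.01)
print(f"about {n / 1e6:.0f} million reads")
```

In this hypothetical case only about 333,000 reads from the organism are needed, but roughly a thousand times more must be sequenced in total, which is why host depletion or enrichment often matters more than raw depth.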

Factors that change recommended depth

Recommended depth should be adjusted when any of the following factors apply.

Biological effect size: Small effects, rare variants and low-abundance transcripts require more evidence than large, robust signals.
Genome or transcriptome size: Larger and more complex genomes or transcriptomes need more reads for comparable sensitivity.
Sample quality: Degraded RNA, FFPE DNA, low-input libraries or contamination can reduce usable depth.
Library complexity: When a library is saturated, extra sequencing mainly increases duplicates rather than new information.
Target size: Smaller target panels can be sequenced more deeply, but molecular diversity and UMI family size become important.
Replication: For many biological questions, more biological replicates can be more valuable than extreme depth in a few samples.
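The library-complexity effect can be illustrated with a simple sampling model. The sketch assumes reads are drawn uniformly at random, with replacement, from a library of a fixed number of distinct molecules, so the expected number of distinct molecules seen is C × (1 − e^(−N/C)). Real libraries are less uniform, and the library size used here is hypothetical.

```python
import math

def expected_unique_molecules(n_reads: float, library_size: float) -> float:
    """Expected distinct molecules observed: C * (1 - exp(-N / C))."""
    return -library_size * math.expm1(-n_reads / library_size)

def expected_duplicate_fraction(n_reads: float, library_size: float) -> float:
    """Expected fraction of reads that are duplicates under the same model."""
    if n_reads <= 0:
        return 0.0
    return 1.0 - expected_unique_molecules(n_reads, library_size) / n_reads

# Hypothetical library of 100 million distinct molecules.
LIBRARY_SIZE = 100e6
for n_reads in (50e6, 100e6, 200e6, 400e6):
    unique = expected_unique_molecules(n_reads, LIBRARY_SIZE)
    dup = expected_duplicate_fraction(n_reads, LIBRARY_SIZE)
    print(f"{n_reads / 1e6:>5.0f}M reads -> {unique / 1e6:5.1f}M unique, {dup:.0%} duplicates")
```

Under this model, going from 50 to 200 million reads raises the duplicate fraction from about 21% to about 57%, so each additional read contributes progressively less new information.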

Recommended sequencing-depth planning workflow

1. Define endpoint: Variant calling, expression, splicing, peaks, methylation, assembly or detection.
2. Choose assay: WGS, WES, panel, RNA-seq, scRNA-seq, ChIP-seq, ATAC-seq or metagenomics.
3. Estimate depth: Use a planning range and adjust for organism, target size, signal and QC risks.
4. Validate with QC: Evaluate mapped reads, saturation, duplication, coverage uniformity and performance metrics.

When a pilot run is useful

A pilot run is valuable for new protocols, uncommon organisms, low-input samples, FFPE material, complex metagenomes, rare-event detection, or when the target cell type or transcriptome complexity is unknown.

Common mistakes to avoid

Using mean coverage alone: Mean coverage can hide poorly covered regions. Always check coverage distribution and target-region thresholds.
Ignoring duplicate rate: High raw depth may not help if the library has low complexity and most additional reads are duplicates.
Underpowering replicates: For differential expression or epigenomics, biological replication is often more important than very high depth in a few samples.
Planning RNA-seq like DNA-seq: RNA-seq depth depends on expression distributions and transcript complexity, not uniform genome coverage.
Ignoring UMIs: For UMI assays, usable molecular depth matters more than raw read count.
Not defining sensitivity: Variant allele fraction, minimum transcript abundance or minimum peak strength should be specified before choosing depth.
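The first mistake in this list, relying on mean coverage alone, can be demonstrated with a toy calculation. The per-base depths below are invented for illustration; in practice they would come from a per-base depth report for the target regions.

```python
from statistics import mean

def fraction_at_or_above(depths, threshold: int) -> float:
    """Fraction of target bases with depth at or above the threshold."""
    if not depths:
        raise ValueError("no depths supplied")
    return sum(d >= threshold for d in depths) / len(depths)

# Invented target: half the bases are covered at 200x, the other half at only 5x.
depths = [200] * 50 + [5] * 50

print(f"mean coverage: {mean(depths):.1f}x")
print(f"bases at >= 20x: {fraction_at_or_above(depths, 20):.0%}")
```

The mean here exceeds 100×, yet only half of the target bases reach 20×, which is why threshold-based metrics should be reported alongside the mean.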

How SciBerg supports sequencing-depth planning

SciBerg can help estimate and document sequencing-depth requirements before sequencing, and evaluate whether delivered data are sufficient after sequencing.

  • Assay-specific sequencing-depth recommendations.
  • Sample sheet and metadata review.
  • Coverage and read-depth calculations.
  • FASTQ, alignment and library-complexity QC.
  • RNA-seq saturation and expression-level assessment.
  • Target-panel coverage and uniformity reports.
  • Single-cell sequencing saturation and reads-per-cell evaluation.
  • Clear reporting of whether additional sequencing is likely to help.

Frequently asked questions

What is sequencing depth?

Sequencing depth describes how much sequencing data is generated for a sample. For DNA sequencing it is often expressed as coverage, such as 30× genome coverage. For RNA-seq it is usually expressed as the number of reads or read pairs per sample.

Is more sequencing depth always better?

No. More depth improves sensitivity only until the assay reaches a practical saturation point. After that, extra sequencing may mostly increase duplicates, cost and storage without improving interpretation.

What is a typical depth for human whole-genome sequencing?

For many human short-read WGS projects, 30× to 50× mean coverage is a common starting range. Some goals and sample conditions, such as low-frequency somatic variant detection, mosaicism, samples with metagenomic contamination or difficult genomic regions, may require deeper sequencing or a different strategy.

How many reads are needed for bulk RNA-seq?

For bulk mRNA-seq, simple expression profiling may need about 5–25 million reads per sample, many standard whole-transcriptome projects use about 30–60 million reads, and deep transcript discovery or complex splicing analysis may require 100–200 million reads per sample.

How many reads per cell are needed for single-cell RNA-seq?

For 10x-style 3′ or 5′ gene-expression libraries, a common minimum starting point is about 20,000 read pairs per captured cell, with deeper sequencing considered when detecting low-abundance transcripts, rare cell states or richer transcriptomes.

Should sequencing depth be planned per sample or per project?

Both. Depth should be chosen per sample based on the assay and biological question, while the total project design must also consider number of biological replicates, batch structure, multiplexing, target outputs and budget.
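Moving from per-sample depth to a project-level request can be sketched as below. The lane capacity and the 10% overage for uneven pooling are hypothetical assumptions; actual output depends on the instrument, flow cell and run configuration.

```python
import math

def lanes_needed(n_samples: int, reads_per_sample: int,
                 reads_per_lane: int, overage: float = 0.10) -> int:
    """Lanes required for all samples, with a margin for uneven pooling."""
    total_reads = n_samples * reads_per_sample * (1.0 + overage)
    return math.ceil(total_reads / reads_per_lane)

def samples_per_lane(reads_per_sample: int, reads_per_lane: int,
                     overage: float = 0.10) -> int:
    """Samples that fit in one lane at the requested depth plus margin."""
    return math.floor(reads_per_lane / (reads_per_sample * (1.0 + overage)))

# Hypothetical project: 24 samples at 40M reads each, 400M reads per lane.
print(lanes_needed(24, 40_000_000, 400_000_000), "lanes")
print(samples_per_lane(40_000_000, 400_000_000), "samples per lane")
```

A calculation like this makes the trade-off explicit: at a fixed budget, lowering per-sample depth frees capacity for additional biological replicates.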

Selected references and documentation

The following resources were used as public guidance points for sequencing-depth planning. Specific study designs should still be adjusted to organism, protocol, sample quality and project endpoint.