Trimming and filtering of NGS reads with Cutadapt
This tutorial shows how to remove adapters, poly-A tails, primers, low-quality bases, and unwanted read-length ranges from sequencing reads. These preprocessing steps are often required before alignment, quantification, variant calling, or downstream statistical analysis.
Read trimming and filtering are common preprocessing steps in NGS analysis. They can remove technical sequences introduced during library preparation, low-quality read ends, poly-A or poly-T stretches, fixed barcodes, and very short reads that are unlikely to align or quantify reliably.
Cutadapt is a flexible command-line tool for removing adapters, primers, poly-A tails, and other unwanted sequences from high-throughput sequencing reads. It also supports quality trimming, length filtering, fixed-base removal, and paired-end processing.
Input: FASTQ or gzip-compressed FASTQ (.fastq.gz) files from single-end or paired-end sequencing.
Output: Cleaned FASTQ files and trimming logs for documentation and downstream analysis.
Install Cutadapt
For reproducible bioinformatics work, the recommended installation method is usually a Conda or Mamba environment with the Bioconda and conda-forge channels configured.
Install with Mamba or Conda
# Create a clean environment for read trimming
mamba create -n ngs-trim -c conda-forge -c bioconda cutadapt
# Activate the environment
mamba activate ngs-trim
# Check the installed version
cutadapt --version
Alternative installation with pip
# Use this only if a Python environment is already managed carefully
python -m pip install --upgrade cutadapt
# Check the installation
cutadapt --version
Why avoid old manual setup commands?
Older tutorials often used manual source downloads and python setup.py install. For modern projects, Conda/Mamba or pip-based installation is usually easier to reproduce and maintain.
Core Cutadapt commands
Replace the adapter sequences used in this tutorial with the exact adapter or primer sequences from your library preparation protocol. The Illumina-style adapter fragments that appear in the examples are placeholders and should be adapted to your experiment.
Trim low-quality bases
# Trim low-quality bases from the 3′ end
cutadapt \
-q 20 \
-o sample.q20.fastq.gz \
sample.fastq.gz
# Trim 5′ and 3′ ends with different quality cutoffs
cutadapt \
-q 15,20 \
-o sample.qtrim.fastq.gz \
sample.fastq.gz
Remove fixed bases from the beginning or end
# Remove the first 3 bases from each read
cutadapt \
-u 3 \
-o sample.trim5p.fastq.gz \
sample.fastq.gz
# Remove 5 bases from the 3′ end
cutadapt \
-u -5 \
-o sample.trim3p.fastq.gz \
sample.fastq.gz
Trim poly-A tails
# Trim poly-A tails with --poly-a (available only in recent Cutadapt releases),
# then discard reads shorter than 18 nt
cutadapt \
--poly-a \
-m 18 \
-o sample.polyA_trimmed.fastq.gz \
sample.fastq.gz
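Because --poly-a is missing from older releases, it can help to check the installed version before running the command above. The following is a minimal sketch, not an official Cutadapt mechanism: the helper names version_ge and check_polya_support are hypothetical, the minimum version 4.4 is an assumption that should be confirmed against the changelog of your installation, and the comparison relies on sort -V being available on your system.

```shell
# Assumed first release with --poly-a; confirm in the Cutadapt changelog.
MIN_POLYA_VERSION="4.4"

# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Uses version-aware sorting (sort -V), so 4.10 is treated as newer than 4.4.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: reports whether the installed Cutadapt is new enough.
check_polya_support() {
    installed=$(cutadapt --version)
    if version_ge "$installed" "$MIN_POLYA_VERSION"; then
        echo "cutadapt $installed: --poly-a should be available"
    else
        echo "cutadapt $installed: older than $MIN_POLYA_VERSION, upgrade before using --poly-a" >&2
        return 1
    fi
}
```

A typical use is to call check_polya_support once at the top of a trimming script and stop if it returns a non-zero status.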
Size-select reads
# Keep reads at least 18 nt long and at most 35 nt long
cutadapt \
-m 18 \
-M 35 \
-o sample.18to35nt.fastq.gz \
sample.fastq.gz
-a: Trim a 3′ adapter from single-end reads or read 1.
-A: Trim a 3′ adapter from read 2 in paired-end data.
-g: Trim a 5′ adapter or primer from single-end reads or read 1.
-q: Trim low-quality bases from read ends before adapter trimming.
-m: Discard reads shorter than the specified minimum length.
-M: Discard reads longer than the specified maximum length.
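The options listed above can be combined in a single Cutadapt call. The sketch below wraps one such combination for single-end reads in a shell function; the function name trim_single and the variable ADAPTER_3P are hypothetical, and the adapter is only a placeholder Illumina-style fragment that must be replaced with the sequence from your protocol.

```shell
# Placeholder 3' adapter; replace with the adapter used in your library preparation.
ADAPTER_3P="AGATCGGAAGAGCACACGTCT"

# Hypothetical wrapper: quality trimming, 3' adapter removal, and length filtering in one pass.
# $1 = input FASTQ file, $2 = output FASTQ file
trim_single() {
    cutadapt \
        -q 20 \
        -a "$ADAPTER_3P" \
        -m 18 \
        -o "$2" \
        "$1"
}
```

For example, trim_single sample.fastq.gz sample.trimmed.fastq.gz writes the cleaned reads and prints the Cutadapt report to standard output. The order of the options on the command line should not matter here, since Cutadapt applies quality trimming before adapter trimming and length filtering afterwards.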
Paired-end read trimming
Paired-end processing keeps read pairs synchronized. Use -o for read 1 output and -p for read 2 output.
For paired-end libraries, adapter sequences for read 1 and read 2 can differ. Always confirm the adapter sequences in the protocol or sample sheet used for the specific sequencing run.
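As an illustration of the -o/-p pairing described above, the following sketch trims one read pair. The function name trim_pair is hypothetical, and the two adapter fragments are placeholders taken from common Illumina-style adapters; substitute the sequences from your own protocol or sample sheet.

```shell
# Placeholder adapters; confirm the real sequences for your sequencing run.
ADAPTER_R1="AGATCGGAAGAGCACACGTCT"
ADAPTER_R2="AGATCGGAAGAGCGTCGTGTA"

# Hypothetical wrapper for one read pair.
# $1 = read 1 input, $2 = read 2 input, $3 = read 1 output, $4 = read 2 output
trim_pair() {
    cutadapt \
        -a "$ADAPTER_R1" \
        -A "$ADAPTER_R2" \
        -q 20 \
        -m 18 \
        -o "$3" \
        -p "$4" \
        "$1" "$2"
}
```

Called as trim_pair sample_R1.fastq.gz sample_R2.fastq.gz sample_R1.trimmed.fastq.gz sample_R2.trimmed.fastq.gz, this keeps both output files in the same read order. With the default pair filter, a pair is discarded as soon as either read falls below the -m cutoff, so the two outputs stay synchronized.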
Adapter examples for common library preparation kits
The current SciBerg resource lists example Cutadapt commands for several library preparation kits. The examples below preserve those adapter patterns while presenting them in a cleaner format.
CATS RNA and DNA library preparation kits
# Read 1: remove first 3 bases, poly-A-like stretches, adapter fragments,
# 5′ template-switching sequence, and reads shorter than 18 nt
cutadapt -u 3 input_R1.fastq.gz | \
cutadapt -a AAAAAAAA - | \
cutadapt -a 'AAAAAAAN$' -a 'AAAAAAN$' -a 'AAAAAN$' - | \
cutadapt -a AGAGCACACGTCTG - | \
cutadapt -O 8 -g GTTCAGAGTTCTACAGTCCGACGATCNNN - | \
cutadapt -m 18 -o output_R1.fastq.gz -
# Read 2
cutadapt -a CCCGATCGTCGG input_R2.fastq.gz | \
cutadapt -a GGGGATCGTCGG - | \
cutadapt -m 18 -o output_R2.fastq.gz -
Adapter sequences can vary by kit version, indexing strategy, read structure, protocol modification, and sequencing provider. Treat these examples as starting points and verify the final parameters for every project.
Batch trimming script
The script below processes paired-end FASTQ files named *_R1.fastq.gz and *_R2.fastq.gz, writes trimmed reads to a new folder, and stores Cutadapt reports.
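#!/usr/bin/env bash
set -euo pipefail

INPUT_DIR="fastq"
OUTPUT_DIR="trimmed_fastq"
REPORT_DIR="cutadapt_reports"
THREADS=8
ADAPTER_R1="AGATCGGAAGAGCACACGTCT"
ADAPTER_R2="AGATCGGAAGAGCGTCGTGTA"

mkdir -p "$OUTPUT_DIR" "$REPORT_DIR"

for R1 in "$INPUT_DIR"/*_R1.fastq.gz; do
    # If nothing matches, the glob stays literal; stop with a clear message
    [[ -e "$R1" ]] || { echo "No *_R1.fastq.gz files found in $INPUT_DIR" >&2; exit 1; }

    SAMPLE=$(basename "$R1" _R1.fastq.gz)
    R2="$INPUT_DIR/${SAMPLE}_R2.fastq.gz"

    # Every read 1 file needs a matching read 2 file
    [[ -f "$R2" ]] || { echo "Missing mate file: $R2" >&2; exit 1; }

    # The Cutadapt report goes to standard output and is saved per sample
    cutadapt \
        -j "$THREADS" \
        -a "$ADAPTER_R1" \
        -A "$ADAPTER_R2" \
        -q 20 \
        -m 18 \
        -o "$OUTPUT_DIR/${SAMPLE}_R1.trimmed.fastq.gz" \
        -p "$OUTPUT_DIR/${SAMPLE}_R2.trimmed.fastq.gz" \
        "$R1" "$R2" \
        > "$REPORT_DIR/${SAMPLE}.cutadapt.txt"
done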
After trimming, run FastQC again and compare the pre- and post-trimming reports. In larger projects, aggregate QC output into a single project-level report.
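The post-trimming QC step could look like the sketch below. The function name qc_trimmed is hypothetical, and MultiQC is an assumption on my part: the text above does not name an aggregation tool, and MultiQC is simply one commonly used option. Both fastqc and multiqc must be installed separately.

```shell
# Hypothetical helper: run FastQC on all trimmed files, then aggregate the reports.
# $1 = directory containing trimmed *.fastq.gz files, $2 = directory for QC reports
qc_trimmed() {
    mkdir -p "$2"
    # One FastQC report per FASTQ file, written to the report directory
    fastqc --outdir "$2" "$1"/*.fastq.gz
    # Assumed aggregation step: MultiQC scans the report directory and writes a summary there
    multiqc "$2" --outdir "$2"
}
```

With the folder names from the batch script, qc_trimmed trimmed_fastq qc_trimmed_reports would produce per-file FastQC reports plus one combined report that can be compared with the raw-read QC.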
Next steps after trimming
After trimming and filtering, reads can be used for downstream analysis. The correct next step depends on the sequencing assay.
Run FastQC on trimmed reads and compare results with the raw-read reports.
Proceed to read alignment, transcript quantification, variant calling, taxonomic classification, or another assay-specific workflow.
Document Cutadapt version, adapter sequences, quality cutoffs, minimum/maximum lengths, and all command-line parameters.
Keep both raw and processed FASTQ files, unless your project data-management policy specifies otherwise.
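One lightweight way to document the trimming parameters is to append them to a plain-text log next to the trimmed files. The sketch below is only one possible convention: the function name record_trimming_params, the variable params_log, and the log layout are all hypothetical and not part of Cutadapt.

```shell
# Hypothetical helper: append date, Cutadapt version, and the options used to a log file.
# $1 = log file; all remaining arguments = the Cutadapt options that were used
record_trimming_params() {
    params_log=$1
    shift
    {
        echo "date: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
        echo "cutadapt_version: $(cutadapt --version)"
        echo "command: cutadapt $*"
    } >> "$params_log"
}
```

For example, record_trimming_params trimming_parameters.txt -q 20 -m 18 -a AGATCGGAAGAGCACACGTCT adds one dated entry per run, which can be stored alongside the per-sample Cutadapt reports.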