Cancer genomes usually contain many somatic alterations, but only a subset contribute to tumour initiation, progression, metastasis or therapy resistance. Identifying driver mutations requires careful somatic variant calling, artefact filtering, annotation, biological prioritisation and, where possible, evidence from recurrence, function, pathways and clinical knowledgebases.
Driver-mutation identification starts with high-quality somatic variant detection and ends with evidence-based prioritisation. In practice, this means distinguishing true tumour alterations from germline variants and technical artefacts, annotating their functional consequences, checking whether they occur in known cancer genes or hotspots, evaluating tumour-type relevance and, where possible, integrating copy-number, structural-variant, fusion, RNA-seq and pathway evidence.
Practical rule: a candidate cancer driver should have one or more strong evidence signals: known oncogenic status, recurrence in the same tumour type, hotspot location, damaging loss-of-function in a tumour suppressor, activating event in an oncogene, copy-number or fusion evidence, pathway-level support, therapeutic relevance, or experimental/clinical support in curated databases.
Driver mutations vs passenger mutations
Tumours accumulate many mutations over time. Some are biologically important; many are passengers. The challenge is to identify alterations that contribute to cancer cell fitness, tumour biology or therapeutic response.
Driver mutation: Promotes cancer initiation, progression, survival, invasion, metastasis, immune evasion or resistance. Examples include activating oncogene mutations, inactivating tumour-suppressor mutations, gene fusions and copy-number changes affecting cancer pathways.
Passenger mutation: Present in the tumour genome but not causally important for the cancer phenotype. Passenger mutations can be common in tumours with high mutation burden or defective repair pathways.
A variant can be a true somatic mutation without being a cancer driver. Somatic status and driver status are separate interpretation steps.
Data types used to identify cancer drivers
Data type
Driver evidence it can provide
Typical outputs
Matched tumour-normal WGS
Broadest DNA view
SNVs, indels, CNVs, structural variants, rearrangements, mutational signatures and non-coding events.
Evidence signals used to prioritise cancer drivers
Evidence signal
How it supports driver status
Example interpretation
Known oncogenic hotspot
Recurrent mutation at a specific residue or domain suggests positive selection.
Activating hotspot in an oncogene is stronger evidence than a random missense variant.
Loss-of-function in tumour suppressor
Truncating mutation, frameshift, splice disruption or deletion may inactivate a tumour suppressor.
More convincing if accompanied by second hit, loss of heterozygosity or deletion.
Copy-number amplification
Focal high-level amplification can increase oncogene dosage.
More convincing when RNA expression of the amplified gene is also elevated.
Homozygous deletion
Deletion of both copies can inactivate a tumour suppressor.
Needs careful evaluation of purity, ploidy and segmentation quality.
Gene fusion
Can create an activated kinase, abnormal transcription factor or promoter-driven overexpression.
RNA-seq confirmation strengthens evidence for an expressed fusion transcript.
Pathway convergence
Multiple alterations in the same pathway can support biological relevance.
MAPK, PI3K, DNA-repair, cell-cycle or immune-evasion pathways are common examples.
Cohort recurrence
Genes or positions mutated more often than expected by background mutation processes are candidate drivers.
Useful in cohort studies, but must control for mutation rate, gene length and tumour type.
Curated evidence
Databases and literature can classify oncogenicity or clinical significance.
OncoKB, CIViC and COSMIC can support interpretation when used with tumour-type context.
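The evidence signals in the table above can be recorded per candidate so that each one carries its own follow-up check. The following is a minimal sketch, not a validated scoring scheme: the signal names, the `Candidate` class and the `review_plan` function are hypothetical labels introduced here for illustration, and the follow-up texts paraphrase the "example interpretation" column.

```python
from dataclasses import dataclass, field

# Hypothetical signal names, one per row of the evidence table.
# Each maps to the follow-up check suggested by that row.
FOLLOW_UP = {
    "known_hotspot": "confirm the residue and tumour-type context",
    "tsg_loss_of_function": "look for a second hit, LOH or deletion",
    "focal_amplification": "check RNA expression of the amplified gene",
    "homozygous_deletion": "review purity, ploidy and segmentation quality",
    "gene_fusion": "confirm an expressed fusion transcript by RNA-seq",
    "pathway_convergence": "list the other altered genes in the pathway",
    "cohort_recurrence": "control for mutation rate, gene length and tumour type",
    "curated_evidence": "record the knowledgebase, version and tumour type",
}


@dataclass
class Candidate:
    gene: str
    alteration: str
    signals: set = field(default_factory=set)


def review_plan(candidate):
    """Return (has_any_signal, follow-up checks) for one candidate."""
    unknown = candidate.signals - set(FOLLOW_UP)
    if unknown:
        raise ValueError(f"Unrecognised signals: {sorted(unknown)}")
    checks = [f"{name}: {FOLLOW_UP[name]}" for name in sorted(candidate.signals)]
    return bool(candidate.signals), checks


if __name__ == "__main__":
    # Illustrative candidate only; not taken from any real sample.
    example = Candidate(
        "TP53", "frameshift", {"tsg_loss_of_function", "cohort_recurrence"}
    )
    has_signal, checks = review_plan(example)
    print(has_signal)
    for line in checks:
        print(" -", line)
```

Keeping the follow-up check next to each signal makes it harder to report a signal without its caveat, which is the main point of the table.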
Single-patient analysis vs cohort driver discovery
The strategy differs depending on whether the goal is to prioritise driver alterations in one tumour or discover driver genes across a cohort.
Single tumour or clinical-style report
Prioritise known oncogenic and actionable alterations.
Interpret each alteration in the context of tumour type, variant allele fraction, purity and copy number.
Prefer matched normal DNA where possible.
Integrate RNA-seq for expression, fusions and splicing evidence.
Report evidence level and limitations clearly.
Research cohort discovery
Run consistent calling across all samples.
Estimate background mutation rates and mutational signatures.
Test gene-level and position-level recurrence.
Consider pathway-level enrichment and mutual exclusivity.
Validate top candidates experimentally or in independent cohorts.
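As a rough illustration of gene-level recurrence testing, the sketch below asks whether a gene carries more mutations across a cohort than a flat background rate would predict. It assumes a single per-base background rate and a Poisson model, which is a deliberate simplification: dedicated driver-discovery methods model mutation-rate heterogeneity, sequence context and tumour type far more carefully. The function names and the numbers in the usage example are hypothetical.

```python
import math


def poisson_sf(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def gene_recurrence_pvalue(n_mutations, gene_length_bp, n_samples, bg_rate_per_bp):
    """One-sided p-value for excess mutations in a gene across a cohort.

    The expected count scales with gene length and cohort size, so long
    genes are not flagged simply for being long.
    """
    expected = bg_rate_per_bp * gene_length_bp * n_samples
    return poisson_sf(n_mutations, expected)


if __name__ == "__main__":
    # Hypothetical cohort: 500 tumours, 2 kb coding length, 1e-6 mutations per bp.
    for observed in (1, 5, 10):
        p = gene_recurrence_pvalue(observed, 2_000, 500, 1e-6)
        print(observed, p)
```

In a real cohort analysis these p-values would also need multiple-testing correction across all genes tested.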
Annotation resources and knowledgebases
Driver identification is strengthened by curated resources, but database hits must be interpreted carefully. Some resources are gene-level, others are variant-level; some focus on biological oncogenicity, others on clinical actionability.
Resource
Useful for
Interpretation note
COSMIC / Cancer Gene Census
Known cancer genes, somatic mutation catalogues, cancer-gene curation.
Excellent for gene-level and mutation-context evidence, but not every variant in a Cancer Gene Census gene is automatically a driver.
OncoKB
Oncogenicity, biological effect, clinical actionability and evidence levels.
Useful for precision-oncology-style interpretation and therapy relevance.
CIViC
Open, evidence-based interpretations of cancer variants from the literature.
Useful for transparent, literature-supported clinical and biological variant interpretation.
ClinVar / ClinGen
Clinical variant assertions, especially germline interpretation.
Useful when hereditary predisposition or germline findings are relevant.
gnomAD and population databases
Filtering common germline variants.
Population frequency alone is not enough; artefact context and tumour biology also matter.
TCGA, ICGC and public cohorts
Tumour-type recurrence, subtype context and comparative analysis.
Useful for benchmarking a cohort or placing a tumour in broader cancer-genomic context.
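For the population-database row, a simple frequency filter might look like the sketch below. It assumes each variant is a dictionary with a hypothetical `population_af` field that has already been annotated from a population resource, and the 0.001 cut-off is an illustrative assumption, not a recommended threshold. Variants above the cut-off are flagged for review rather than silently dropped, because population frequency is only one line of evidence.

```python
def flag_possible_germline(variants, max_population_af=0.001):
    """Split variants into (kept, flagged) by population allele frequency.

    `population_af` is None when the variant is absent from the population
    resource; absence alone does not prove somatic origin.
    """
    kept, flagged = [], []
    for variant in variants:
        af = variant.get("population_af")
        if af is not None and af > max_population_af:
            flagged.append(variant)
        else:
            kept.append(variant)
    return kept, flagged


if __name__ == "__main__":
    # Hypothetical annotated calls from a tumour-only sample.
    calls = [
        {"id": "var1", "population_af": 0.12},
        {"id": "var2", "population_af": None},
        {"id": "var3", "population_af": 0.00002},
    ]
    kept, flagged = flag_possible_germline(calls)
    print([v["id"] for v in kept])
    print([v["id"] for v in flagged])
```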
Why DNA-seq alone may not be enough
DNA-seq identifies candidate alterations, but RNA-seq and other omics layers can reveal whether those alterations are expressed or functionally active. This is especially important for fusions, splice-site variants, copy-number amplifications, enhancer or promoter changes, and pathway activation.
DNA alteration: Mutation, deletion, amplification, rearrangement or fusion breakpoint.
RNA consequence: Expression, allele-specific expression, fusion transcript or abnormal splicing.
Functional context: Pathway activity, tumour subtype, mutational signature or therapeutic relevance.
Example: a DNA-level fusion breakpoint is more convincing as a functional driver if RNA-seq confirms an expressed in-frame fusion transcript involving a known oncogenic domain.
Common pitfalls in driver-mutation identification
Calling every cancer-gene variant a driver: Gene-level cancer association is not equivalent to variant-level oncogenicity.
Ignoring matched normal DNA: Tumour-only workflows can misclassify germline variants as somatic candidates.
Ignoring tumour purity and ploidy: Variant allele fraction, copy number and clonality are difficult to interpret without purity and ploidy context.
Overtrusting prediction scores: In silico pathogenicity predictions are helpful but insufficient for driver classification by themselves.
Missing non-SNV drivers: Drivers can be CNVs, structural variants, fusions, splice events, promoter mutations or epigenetic changes.
Ignoring tumour type: The same alteration may have different biological or clinical meaning depending on cancer type and subtype.
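To make the purity and ploidy pitfall concrete, the sketch below computes the variant allele fraction expected for a clonal somatic variant in a bulk sample. It uses the standard mixture relationship VAF = purity × mutant copies / (purity × tumour copies + (1 − purity) × normal copies) and assumes a clonal variant, a diploid normal compartment by default and no sequencing bias. The function name and parameter names are chosen here for illustration.

```python
def expected_vaf(purity, mutant_copies, tumour_copies, normal_copies=2):
    """Expected VAF of a clonal somatic variant in a bulk tumour sample.

    purity         fraction of tumour cells in the sample, in (0, 1]
    mutant_copies  copies carrying the variant in each tumour cell
    tumour_copies  total copies of the locus in each tumour cell
    normal_copies  total copies of the locus in each normal cell
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if tumour_copies < 1:
        raise ValueError("tumour_copies must be at least 1")
    if not 0 <= mutant_copies <= tumour_copies:
        raise ValueError("mutant_copies must be between 0 and tumour_copies")
    total_alleles = purity * tumour_copies + (1 - purity) * normal_copies
    return purity * mutant_copies / total_alleles


if __name__ == "__main__":
    # Heterozygous variant, diploid locus, 60% purity.
    print(expected_vaf(0.6, 1, 2))
    # Same purity after copy-neutral loss of heterozygosity (both copies mutant).
    print(expected_vaf(0.6, 2, 2))
```

At 60% purity a clonal heterozygous variant in a diploid region is expected at a VAF of about 0.30, not 0.50, so a VAF well below 0.5 does not by itself indicate a subclonal event.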
AI integration for driver-mutation prioritisation
AI can support cancer driver analysis by combining structured annotations, curated databases, published literature, pathway knowledge, sequence context and multi-omics results. It can be especially useful for ranking candidate variants, summarising evidence and producing transparent review-ready reports.
Useful AI-supported tasks
Literature triage for candidate variants and genes.
Evidence summarisation from curated resources.
Prioritisation of variants by biological plausibility.
Integration of DNA-seq, RNA-seq and copy-number evidence.
Report drafting and review checklists.
Required safeguards
Use versioned databases and reproducible workflows.
Keep human scientific review in the loop.
Separate research interpretation from clinical claims.
Document evidence sources and uncertainty.
Validate important findings where needed.
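One way to put these safeguards into practice is to store every evidence statement together with its source, source version and review status. The sketch below is an assumption about how such a record could look, not a description of any particular tool: the `EvidenceItem` fields, the `reportable` helper and the placeholder values are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EvidenceItem:
    source: str           # knowledgebase name or publication identifier
    source_version: str   # release label or access date of the source
    statement: str        # the evidence claim, in plain language
    generated_by: str     # for example "curated" or "ai_summary"
    reviewed_by_human: bool = False


def reportable(items):
    """Keep only evidence items that a human reviewer has signed off."""
    return [item for item in items if item.reviewed_by_human]


if __name__ == "__main__":
    # Placeholder values for illustration; not real database content.
    items = [
        EvidenceItem("knowledgebase_A", "release-placeholder",
                     "Variant listed as likely oncogenic", "curated", True),
        EvidenceItem("literature_search", "accessed-placeholder",
                     "Summary of three candidate papers", "ai_summary", False),
    ]
    print(json.dumps([asdict(item) for item in reportable(items)], indent=2))
```

Separating `generated_by` from `reviewed_by_human` keeps AI-drafted summaries visible in the audit trail while preventing them from reaching a report unreviewed.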
How SciBerg can support driver-mutation analysis
SciBerg can help research and industrial partners identify and prioritise candidate driver mutations from NGS data using reproducible bioinformatics workflows and evidence-based interpretation.
FASTQ, BAM/CRAM, VCF and sample-metadata QC.
Somatic SNV and indel calling from tumour-normal or tumour-only data.
Copy-number, structural-variant and fusion analysis where data support it.
Variant annotation and prioritisation using cancer gene resources.
RNA-seq integration for expression, splicing and fusion validation.
Candidate driver tables with rationale and evidence categories.
AI-assisted literature and knowledgebase review with documented sources.
Transparent scripts, software versions, parameters and final reports.
Frequently asked questions
What is a driver mutation in cancer?
A driver mutation is a genomic alteration that gives a cancer cell a selective growth, survival, invasion, immune-evasion or treatment-resistance advantage. It contributes to cancer development or progression rather than simply being a by-product of genomic instability.
What is the difference between a driver and a passenger mutation?
A driver mutation contributes to the cancer phenotype, while a passenger mutation is present in the tumour genome but does not materially promote tumour growth or survival. Distinguishing them requires biological, statistical and clinical evidence.
Can a mutation be a driver in one cancer type but not another?
Yes. Driver status is context-dependent. The same alteration may be strongly oncogenic in one tumour type, weakly relevant in another, or clinically irrelevant if the affected pathway is not active in that cellular context.
Is every mutation in a known cancer gene a driver?
No. A gene may be a known cancer gene, but not every variant in that gene is oncogenic. Variant-level interpretation is essential, especially for missense variants and variants of uncertain significance.
Can tumour-only sequencing identify driver mutations?
Tumour-only sequencing can identify candidate driver alterations, but matched normal DNA is preferred because it helps distinguish somatic mutations from germline variants and technical artefacts.
How can RNA-seq help identify driver mutations?
RNA-seq can show whether a mutated gene is expressed, detect expressed fusion transcripts, reveal splicing consequences, and provide pathway-level evidence that complements DNA-seq.
Can AI identify cancer driver mutations?
AI can help prioritise candidate drivers by integrating sequence context, variant annotations, databases, literature, pathway information and multi-omics data. However, AI output should be reviewed, documented and validated rather than treated as final evidence.
Selected resources and documentation
The following resources are useful starting points for cancer-driver annotation, somatic-variant calling and evidence-based cancer-variant interpretation.