Joint processing of long- and short-read sequencing data with deep learning improves variant calling

Gambardella, Gennaro
2025-01-01

Abstract

Despite the complementary strengths of short- and long-read sequencing, variant-calling methods still rely on a single data type. In this study, we collected and harmonized Nanopore datasets for the seven healthy individuals in the GIAB project from three independent consortia. Using these harmonized Nanopore data, we explore the benefits of a hybrid DeepVariant model that jointly processes Illumina and Nanopore data for germline variant detection. We show that a shallow hybrid long- and short-read sequencing approach can match or surpass the germline variant detection accuracy of state-of-the-art single-technology methods, potentially reducing overall sequencing costs and enabling the detection of large germline structural variations. These findings hold great promise for molecular diagnostics in clinical settings, particularly for screening for rare genetic diseases.
2025
Keywords

CP: Computational biology; CP: Genetics; DeepVariant; GIAB; Illumina; Nanopore; deep learning; germline variants; hybrid variant calling; long reads; rare genetic disease; short read
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14246/1801