Accelerating genomic workflows using NVIDIA Parabricks

O’Connell, Kyle A.; Yosufzai, Zelaikha B.; Campbell, Ross A.; Lobb, Collin J.; Engelken, Haley T.; Gorrell, Laura; Carlson, Thaddeus B.; Catana, Josh J.; Mikdadi, Dina; Bonazzi, Vivien; Klenk, Juergen

doi:10.1101/2022.07.20.498972

Cited by 4 publications

(4 citation statements)

References 27 publications

“…An updated Nextflow pipeline compatible with latest version tools as available in March 2024 is also made available. In this pipeline, GPU-supported variant calling is performed using Clara Parabricks [44]. The pipelines are made available on Github.…”

Section: Discussion (mentioning)

confidence: 99%

Versatile, accessible cross-platform molecular profiling of central nervous system tumors: web-based, prospective multi-center validation

Sahm, Patel, Göbel et al. 2024

Preprint

The 2021 WHO classification underscores the importance of molecular data integration in Central Nervous System (CNS) tumor diagnostics. However, currently used assays have disadvantages due to technical complexity, required equipment and reagent cost, as well as lengthy turnaround times. In response to these challenges, we introduce Rapid-CNS2 and MNP-Flex. Rapid-CNS2, an adaptive sampling-based nanopore sequencing workflow, offers real-time methylation classification and DNA copy-number information within a 30-minute window, suitable for intra-operative settings, followed by comprehensive molecular profiling within 24 h, covering the complete spectrum of diagnostically and therapeutically relevant information for the respective entity. We have prospectively validated Rapid-CNS2 in a multi-center setting on 223 samples. For even more widespread use of methylation-based CNS tumor classification, we developed MNP-Flex, a platform-agnostic methylation classifier encompassing 184 CNS tumor classes. MNP-Flex achieved 92% accuracy across a global validation cohort of 78,000 samples spanning five different technologies. These innovations represent a significant advancement in CNS tumor diagnostics, making rapid, actionable molecular insights more widely and more rapidly available, which is crucial for personalized treatment strategies. Their integration streamlines the diagnostic process, broadening access to accurate molecular classification and promising improved patient outcomes in neuro-oncology on a global scale. MNP-Flex is available as a web service at https://mnp-flex.org and the Rapid-CNS2 workflow is available on GitHub.

“…2020a). Alternative approaches for data preprocessing have been developed, such as the GPU‐based NVIDIA Parabricks (O'Connell, Yosufzai, and Campbell 2022). Another major hardware improvement is related to the CPU‐based systems.…”

Section: Discussion (mentioning)

confidence: 99%

“…For example, while the preprocessing of a single WGS sample from the FASTQ file to the GVCF took several hours on a well-equipped CPU-based system a decade ago, it requires <30 min on an FPGA-based computer (Betschart et al 2022; Zhao et al 2020a). Alternative approaches for data preprocessing have been developed, such as the GPU-based NVIDIA Parabricks (O'Connell, Yosufzai, and Campbell 2022). Another major hardware improvement is related to the CPU-based systems.…”

Section: Figure 18 (mentioning)

confidence: 99%

Biostatistical Aspects of Whole Genome Sequencing Studies: Preprocessing and Quality Control

Betschart, Riccio, Aguilera‐Garcia et al. 2024

Biometrical J

Rapid advances in high‐throughput DNA sequencing technologies have enabled large‐scale whole genome sequencing (WGS) studies. Before performing association analysis between phenotypes and genotypes, preprocessing and quality control (QC) of the raw sequence data need to be performed. Because many biostatisticians have not been working with WGS data so far, we first sketch Illumina's short‐read sequencing technology. Second, we explain the general preprocessing pipeline for WGS studies. Third, we provide an overview of important QC metrics, which are applied to WGS data: on the raw data, after mapping and alignment, after variant calling, and after multisample variant calling. Fourth, we illustrate the QC with the data from the GENEtic SequencIng Study Hamburg–Davos (GENESIS‐HD), a study involving more than 9000 human whole genomes. All samples were sequenced on an Illumina NovaSeq 6000 with an average coverage of 35× using a PCR‐free protocol. For QC, one genome in a bottle (GIAB) trio was sequenced in four replicates, and one GIAB sample was successfully sequenced 70 times in different runs. Fifth, we provide empirical data on the compression of raw data using the DRAGEN original read archive (ORA). The most important quality metrics in the application were genetic similarity, sample cross‐contamination, deviations from the expected Het/Hom ratio, relatedness, and coverage. The compression ratio of the raw files using DRAGEN ORA was 5.6:1, and compression time was linear by genome coverage. In summary, the preprocessing, joint calling, and QC of large WGS studies are feasible within a reasonable time, and efficient QC procedures are readily available.

“…For germline callers, the author observed speedups of up to 65× (GATK haplotype caller). Alternatively, somatic variant callers achieved speedups of up to 56.8× (Mutect2 algorithm) [81]. For emergency use for hospitalized patients, Clark et al built a pipeline based on the DRAGEN platform to analyze genome sequencing data.…”

Section: The Nvidia Gpus Dragen Fpgas Systems and Ai Medical Cloud Pl... (mentioning)

confidence: 99%

Cutting-Edge AI Technologies Meet Precision Medicine to Improve Cancer Care

et al. 2022

To provide precision medicine for better cancer care, researchers must work on clinical patient data, such as electronic medical records, physiological measurements, biochemistry, computerized tomography scans, digital pathology, and the genetic landscape of cancer tissue. To interpret big biodata in cancer genomics, an operational flow based on artificial intelligence (AI) models and medical management platforms with high-performance computing must be set up for precision cancer genomics in clinical practice. To work in the fast-evolving fields of patient care, clinical diagnostics, and therapeutic services, clinicians must understand the fundamentals of the AI tool approach. Therefore, the present article covers the following five themes: (i) computational prediction of pathogenic variants of cancer susceptibility genes; (ii) AI model for mutational analysis; (iii) single-cell genomics and computational biology; (iv) text mining for identifying gene targets in cancer; and (v) the NVIDIA graphics processing units, DRAGEN field programmable gate arrays systems and AI medical cloud platforms in clinical next-generation sequencing laboratories. Based on AI medical platforms and visualization, large amounts of clinical biodata can be rapidly copied and understood using an AI pipeline. The use of innovative AI technologies can deliver more accurate and rapid cancer therapy targets.
