Optical genome mapping identifies a novel pediatric embryonal tumor with a ZNF532::NUTM1 fusion

Bornhorst, Miriam; Eze, Augustine; Bhattacharya, Surajit; Putnam, Ethan; Almira-Suarez, M. Isabel; Rossi, Christopher T.; Kambhampati, Madhuri; Almalvez, Miguel; Barseghyan, Mariam; Risco, Nicole Del; Dotson, David; Turner, Joyce; Myseros, John S.; Vilain, Éric; Packer, Roger J.; Nazarian, Javad; Rood, Brian R.; Barseghyan, Hayk

doi:10.1002/path.6085

Cited by 4 publications

(1 citation statement)

References 40 publications


“…Optical genome mapping (developed by Bionano Genomics, San Diego, CA, USA) excels at detecting structural variants (SV), such as balanced translocations and deletions/insertions in the 1 kb to 1 Mb range, and its clinical utility was demonstrated in Duchenne (DMD) or facioscapulohumeral (FSHD) muscular dystrophies (Barseghyan et al, 2017; Sharim et al, 2019), and cancer (Neveling et al, 2021; Talsania et al., 2022; Bornhorst et al, 2023). But long-read sequence (LRS), developed by Pacific Biosciences (Menlo Park, CA, USA) and Oxford Nanopore Technologies (Oxford, UK) among others, has been shown to be the most appropriate technology to detect smaller variants in the 50 bp-1 kb range (Chaisson et al, 2019).…”

Section: Introduction (mentioning)

confidence: 99%

Benchmarking long-read genome sequence alignment tools for human genomics applications

LoTempio, Delot, Vilain. 2023. PeerJ.

Self Cite


Background: The utility of long-read genome sequencing platforms has been shown in many fields, including whole-genome assembly, metagenomics, and amplicon sequencing. Less clear is the applicability of long reads to reference-guided human genomics, which is the foundation of genomic medicine. Here, we benchmark available platform-agnostic alignment tools on datasets from nanopore and single-molecule real-time platforms to understand their suitability in producing a genome representation.

Results: For this study, we leveraged publicly available data from sample NA12878 generated on Oxford Nanopore and sample NA24385 generated on Pacific Biosciences platforms. We employed state-of-the-art sequence alignment tools including GraphMap2, long-read aligner (LRA), Minimap2, CoNvex Gap-cost alignMents for Long Reads (NGMLR), and Winnowmap2. Minimap2 and Winnowmap2 were computationally lightweight enough for use at scale, while GraphMap2 was not. NGMLR was slow and resource-intensive, but produced alignments on every run. LRA was fast, but only worked on Pacific Biosciences data. The tools disagreed widely on which reads to leave unaligned, affecting the final genome coverage and the number of discoverable breakpoints. No alignment tool independently resolved all large structural variants (1,001–100,000 base pairs) present in the Database of Genomic Variants (DGV) for sample NA12878 or the truthset for NA24385.

Conclusions: These results suggest a combined approach is needed for LRS alignments for human genomics. Specifically, leveraging alignments from three tools will be more effective in generating a complete picture of genomic variability. It should be best practice to use an analysis pipeline that generates alignments with both Minimap2 and Winnowmap2, as they are lightweight and yield different views of the genome. Depending on the question at hand, the data available, and the time constraints, NGMLR and LRA are good options for a third tool. If computational resources and time are not a factor for a given case or experiment, NGMLR will provide another view, and another chance to resolve a case. LRA, while fast, did not work on the nanopore data on our cluster, but PacBio results were promising in that those computations completed faster than Minimap2. Due to its significant burden on computational resources and slow run time, GraphMap2 is not an ideal tool for exploration of a whole human genome generated on a long-read sequencing platform.
