2023
DOI: 10.1038/s41467-023-39279-7

Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with million-scale in-silico library

Abstract: Spectrum matching is the most common method for compound identification in mass spectrometry (MS). However, some challenges limit its efficiency, including the coverage of spectral libraries, the accuracy, and the speed of matching. In this study, a million-scale in-silico EI-MS library is established. Furthermore, an ultra-fast and accurate spectrum matching (FastEI) method is proposed to substantially improve accuracy using Word2vec spectral embedding and boost the speed using the hierarchical navigable smal…

Cited by 22 publications (11 citation statements)
References: 52 publications

“…Thereby, Yang et al proposed a library matching method, FastEI, by integrating Spec2vec and graph-based search [100]. In this method, Spec2vec was introduced to extract the features of characteristic m/z peaks, and Hierarchical Navigable Small-world Graph (HNSW) index, a graph-based search algorithm, was employed to realize the approximate nearest neighbor search in the spectral database. This search algorithm significantly improved search speed, by avoiding the calculation of similarity between the to-be-determined spectrum and each spectrum in the library one by one.…”
Section: Molecular Structure To Spectrum (mentioning)
confidence: 99%
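
The statement above contrasts scoring a query against every library spectrum with walking a neighbor graph to find approximate nearest neighbors. The following is a minimal sketch of that contrast, not FastEI's implementation. It assumes random vectors in place of real spectral embeddings, and it uses a single-layer neighbor graph built by brute force rather than the multi-layer, incrementally built HNSW index. All function names and parameters (`m`, `ef`, `entry`) are hypothetical and chosen for illustration only.

```python
import heapq
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_search(library, query, k):
    """Exhaustive baseline: score the query against every library vector."""
    scored = [(cosine(vec, query), idx) for idx, vec in enumerate(library)]
    return [idx for _, idx in heapq.nlargest(k, scored)]


def build_knn_graph(library, m):
    """Link each vector to its m most similar neighbours.

    Assumption: the graph is built by brute force for brevity. A real HNSW
    index inserts points incrementally and keeps several layers.
    """
    graph = {}
    for i, vec in enumerate(library):
        scored = [(cosine(vec, other), j)
                  for j, other in enumerate(library) if j != i]
        graph[i] = [j for _, j in heapq.nlargest(m, scored)]
    return graph


def graph_search(library, graph, query, k, entry=0, ef=20):
    """Best-first walk over the graph, keeping the ef best nodes seen so far.

    Returns the top-k node indices and the number of nodes actually scored.
    """
    visited = {entry}
    start = cosine(library[entry], query)
    candidates = [(-start, entry)]  # max-heap on similarity (negated)
    best = [(start, entry)]         # min-heap holding up to ef results
    while candidates:
        neg_sim, node = heapq.heappop(candidates)
        if len(best) >= ef and -neg_sim < best[0][0]:
            break  # best unexplored candidate is worse than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            sim = cosine(library[nb], query)
            if len(best) < ef or sim > best[0][0]:
                heapq.heappush(candidates, (-sim, nb))
                heapq.heappush(best, (sim, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return [idx for _, idx in heapq.nlargest(k, best)], len(visited)


if __name__ == "__main__":
    random.seed(0)
    # Stand-in "embeddings": 200 random 16-dimensional vectors.
    library = [[random.gauss(0, 1) for _ in range(16)] for _ in range(200)]
    query = [random.gauss(0, 1) for _ in range(16)]
    graph = build_knn_graph(library, m=8)
    exact = brute_force_search(library, query, k=5)
    approx, scored = graph_search(library, graph, query, k=5)
    print("exact top-5: ", exact)
    print("approx top-5:", approx)
    print("vectors scored by graph search:", scored, "of", len(library))
```

The graph search scores only the nodes it visits, which is where the speed-up over one-by-one comparison comes from. The trade-off is that its result is approximate and may miss some of the exact top hits, depending on `m`, `ef`, and the entry point.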
“…One compound may generate distinct MS/MS spectra under different collision energies, dissociation strategies, and mass tolerance windows. On the other hand, compounds having very different structures may also generate similar MS/MS spectra. To better establish linkages among compounds, it is necessary to complement existing molecular networks with rules independent of MS spectra.…”
Section: Introduction (mentioning)
confidence: 99%
“…Directed message passing neural networks (D-MPNN) operate on molecular graphs based on chemical structures in the simplified molecular input line entry system (SMILES) notation to build neural representations of molecules for property prediction including mass spectra. Notably, a recent study by Yang et al addresses limitations in reference library coverage by predicting additional missing mass spectra with known structures using NEIMS.…”
Section: Introduction (mentioning)
confidence: 99%