2021
DOI: 10.3390/biom11121793

MassGenie: A Transformer-Based Deep Learning Method for Identifying Small Molecules from Their Mass Spectra

Abstract: The ‘inverse problem’ of mass spectrometric molecular identification (‘given a mass spectrum, calculate/predict the 2D structure of the molecule whence it came’) is largely unsolved, and is especially acute in metabolomics where many small molecules remain unidentified. This is largely because the number of experimentally available electrospray mass spectra of small molecules is quite limited. However, the forward problem (‘calculate a small molecule’s likely fragmentation and hence at least some of its mass s…

Cited by 52 publications (44 citation statements)
References 114 publications
“…Our study highlighted several other compounds (not discussed here) that changed significantly in severity and outcome. However, as is common in metabolomic studies (Blaženović et al, 2018; Salek et al, 2013; Shrivastava et al, 2021), these require further rigorous identification following the Metabolomics Standards Initiative (MSI) for reporting metabolite identification (Sumner et al, 2007) and so were not included in the predictive model at this stage. Examples of such compounds include pentahomomethionine (MSI 3) and trihomomethionine (MSI 3), both sulphur-containing amino acids.…”
Section: Compounds Requiring Further Investigation
Citation type: mentioning
confidence: 99%
“…Here, mass fragmentation spectra (MS/MS spectra) acquired through data dependent acquisition (DDA) or data independent acquisition (DIA) alternatives have demonstrated their merits in adding structural information to metabolomics profiles, as we will also demonstrate throughout this review. As such, computational metabolomics tools that capitalize on MS and MS/MS information are a pragmatic solution since it is unlikely that we will ever cover the true chemical diversity in nature exhaustively with available reference standards given the vastness of estimated natural chemical space (Polishchuk et al, 2013; Shrivastava et al, 2021).…”
Section: Background and Motivation
Citation type: mentioning
confidence: 99%
“…This, of course, strongly relies on the quality of the generated data, i.e., how closely in-silico generated MS/MS spectra correspond to actual MS/MS spectra of the respective molecules. For instance, the usability of transformer-based DL architectures for doing mass spectral annotations was recently demonstrated with MassGenie (Shrivastava et al, 2021). To overcome the limitation of low amounts of metabolomics data, the authors of MassGenie used in-silico fragmentation to generate MS/MS spectra for about 6 million small molecules.…”
Section: Machine Learning For Metabolite Annotation
Citation type: mentioning
confidence: 99%
“…CSI:FingerID (Dührkop et al, 2015), COSMIC (M. A. Hoffmann et al, 2021), MassGenie (Shrivastava et al, 2021)).…”
Section: Spectral Libraries As a Source Of Machine Learning Training ...
Citation type: mentioning
confidence: 99%