Novel protein discovery and immunopeptidomics depend on highly sensitive de novo peptide sequencing with tandem mass spectrometry. Despite significant improvements from deep learning models, missing fragmentation remains a major hurdle that degrades the performance of de novo peptide sequencing. In this paper, we show that, during peptide prediction, missing fragmentation leads to incorrect amino acids being generated within the affected regions and causes errors to accumulate thereafter. To tackle this problem, we propose GraphNovo, a two-stage de novo peptide sequencing algorithm based on a graph neural network. In the first stage, GraphNovo finds the optimal path, which then guides sequence prediction in the second stage. Our experiments demonstrate that GraphNovo mitigates the effects of missing fragmentation and outperforms state-of-the-art de novo peptide sequencing algorithms.
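To illustrate why missing fragmentation makes a region ambiguous, the following minimal sketch enumerates the residue strings that could explain the mass gap between two adjacent fragment peaks. It is not GraphNovo's algorithm; the function name `explanations`, the 0.01 Da tolerance, and the two-residue limit are assumptions chosen for illustration. The residue masses are standard monoisotopic values.

```python
from itertools import product

# Standard monoisotopic residue masses in Da.
# "L" stands for both leucine and isoleucine, which have identical mass.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def explanations(gap, tol=0.01, max_len=2):
    """Return every residue string of length <= max_len whose total
    mass lies within tol (Da) of the observed mass gap.

    Hypothetical helper for illustration only.
    """
    hits = []
    for length in range(1, max_len + 1):
        for combo in product(RESIDUE_MASS, repeat=length):
            total = sum(RESIDUE_MASS[r] for r in combo)
            if abs(total - gap) <= tol:
                hits.append("".join(combo))
    return sorted(hits)


# A gap of one alanine has a single explanation.
print(explanations(71.03711))
# A gap of about 128.0586 Da is ambiguous: it may be one glutamine, or
# glycine and alanine in either order with the peak between them missing.
print(explanations(128.05858))
```

When the fragment peak between two residues is absent, the spectrum alone cannot separate these candidates, which is how an incorrect amino acid can be generated inside such a region.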