2021
DOI: 10.1101/2021.02.05.429957
Preprint

Large-scale tandem mass spectrum clustering using fast nearest neighbor searching

Abstract: Rationale: Advanced algorithmic solutions are necessary to process the ever-increasing amounts of mass spectrometry data being generated. Here we describe the falcon spectrum clustering tool for efficient clustering of millions of MS/MS spectra. Methods: falcon clusters large amounts of mass spectral data efficiently using advanced techniques for fast spectrum similarity searching. First, high-resolution spectra are binned and converted to low-dimensional vectors using feature hashing. Ne…
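
The binning and feature-hashing step named in the abstract can be illustrated with a minimal sketch. This is not falcon's implementation: the function names `hash_spectrum` and `cosine`, the bin width of 0.05 m/z, the vector length of 800, and the use of CRC32 as the hash function are all assumptions chosen for illustration only.

```python
import math
import zlib


def hash_spectrum(peaks, bin_width=0.05, dim=800):
    """Bin (m/z, intensity) peaks and fold the bins into a vector of length `dim`.

    `bin_width` and `dim` are illustrative values, not falcon's settings.
    """
    vec = [0.0] * dim
    for mz, intensity in peaks:
        bin_index = int(mz / bin_width)  # fine-grained m/z bin
        # Hash the bin index to a slot in the low-dimensional vector.
        # Distinct bins may collide in the same slot; that is the trade-off
        # feature hashing accepts in exchange for a fixed, small dimension.
        slot = zlib.crc32(str(bin_index).encode()) % dim
        vec[slot] += intensity
    # Scale to unit length so a dot product equals the cosine similarity.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def cosine(u, v):
    """Dot product of two unit-length vectors."""
    return sum(a * b for a, b in zip(u, v))
```

With vectors of this kind, comparing two spectra reduces to a dot product over a fixed-length vector, which is the form that nearest neighbor indexes generally expect as input.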

Cited by 5 publications (1 citation statement)
References 41 publications (41 reference statements)
“…Compared with IMBR, spectral clustering is less sensitive to the issue of overlapping MS1 isotope patterns as the transfers of identifications are based on MS2 spectrum similarity rather than similar retention times and mass-to-charge ratios only. MaRaCluster (© Matthew The) (11) is one such spectrum clustering tool that showed competitive performance over others (16, 17) and can also be used for TMT data. However, MaRaCluster has not yet been able to combine data from several TMT batches for the purpose of reducing missing quantification values.…”
mentioning
confidence: 99%