2020
DOI: 10.48550/arxiv.2009.14123
Preprint

Communication Lower-Bounds for Distributed-Memory Computations for Mass Spectrometry based Omics Data

Abstract: Mass spectrometry-based omics data analysis requires significant time and resources. To date, few parallel algorithms have been proposed for deducing peptides from mass spectrometry-based data. However, these parallel algorithms were designed and developed when the amount of data that needed to be processed was smaller in scale. In this paper, we prove that the communication bound that is reached by the existing parallel algorithms is Ω(mn + 2rq/p), where m and n are the dimensions of the theoretical databas…

Cited by 1 publication (2 citation statements)
References 16 publications
“…The model-spectra database can grow exponentially in space (several giga to terabytes) as the post-translational modifications (PTMs) are incorporated in simulation [2], [21]. Therefore, the cost of moving, and managing this data to match with the spectra now exceeds the costs of doing the arithmetic operations in these search engines leading to non-scalable workflows with increasingly larger, and complex data sets [22].…”
Section: Main (mentioning)
confidence: 99%
“…However, computationally optimal HPC algorithms that minimize both the computational and communications costs for these tasks are still needed. Urgent need for developing methods that exhibit optimal performance is illustrated in our theoretical framework [22], and can potentially lead to large-scale systems biology studies especially for meta-proteomics, proteogenomic, and MS based microbiome or non-model organisms' studies having direct impact on personalized nutrition, microbiome research, and cancer therapeutics.…”
Section: Main (mentioning)
confidence: 99%