The widespread use of mass spectrometry for protein identification has created an urgent demand for more computationally efficient matching of mass spectrometry data to protein databases. With the rapid development of chip technology and parallel computing techniques, such as multicore processors, many-core coprocessors, and multinode clusters, the speed and performance of the major mass spectral search engines are continuously improving. Over the past ten years, X!Tandem, a popular and representative open-source program for searching mass spectra, has been extended into several parallel versions that obtain considerable speedups. However, because these parallel strategies are mainly based on clusters of nodes, higher costs (e.g., electricity and maintenance charges) are needed to obtain limited speedups. Fortunately, the Intel Many Integrated Core (MIC) architecture and Graphics Processing Units (GPUs) are well suited to this problem. In this paper, we present and implement a parallel strategy for X!Tandem using MIC, called MIC-Tandem, which shows excellent speedups on commodity hardware and produces the same results as the original program.
With the dramatic increase in available data, data processing must deliver ever higher performance. Most research on k-Nearest Neighbor (kNN) query algorithms is based on regular partitioning methods, which easily cause load imbalance and can even degrade the overall performance of the kNN query algorithm. In addition, traditional kNN query algorithms run on a single process or a single machine, which cannot achieve sufficient efficiency when dealing with big data. To address these two issues, this paper presents a kNN algorithm based on an irregular partitioning method and executes it on the distributed parallel computing platform MapReduce. Experiments show that the irregular-partitioning-based kNN algorithm using MapReduce can obtain much higher performance and can guarantee very efficient queries when dealing with big data.
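The abstract does not specify how the irregular partitions are built, so the following is only a minimal single-machine sketch of the general idea, not the authors' method. It assumes, purely for illustration, that "irregular" means equal-count partitions whose boundaries follow the data rather than a fixed grid, and it imitates the map and reduce phases with plain Python functions. All names (`equal_count_partitions`, `map_local_knn`, `reduce_merge`, `knn_query`) are hypothetical.

```python
import heapq
import math
from functools import reduce


def equal_count_partitions(points, n_parts):
    """Split points into partitions of near-equal size along sorted order.

    Assumption: boundaries follow the data, so dense regions get narrower
    cells than sparse ones, which keeps the per-partition load balanced.
    """
    ordered = sorted(points)
    size = max(1, math.ceil(len(ordered) / n_parts))
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def map_local_knn(partition, query, k):
    """Map step: the k nearest (distance, point) candidates in one partition."""
    return heapq.nsmallest(k, ((math.dist(p, query), p) for p in partition))


def reduce_merge(a, b, k):
    """Reduce step: merge two candidate lists and keep the k closest."""
    return heapq.nsmallest(k, a + b)


def knn_query(points, query, k, n_parts=4):
    """Exact kNN: every partition is scanned, then local results are merged."""
    parts = equal_count_partitions(points, n_parts)
    local = [map_local_knn(part, query, k) for part in parts]
    merged = reduce(lambda a, b: reduce_merge(a, b, k), local, [])
    return [p for _, p in merged]


if __name__ == "__main__":
    data = [(0, 0), (1, 1), (2, 2), (10, 10), (11, 11), (5, 5), (0, 1), (9, 9)]
    print(knn_query(data, query=(0, 0), k=3))
```

In a real MapReduce job each partition would be handled by a separate mapper and the merge would run in a reducer; here both run in one process so that the balancing and merging logic can be read in isolation.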