Mohamed Malhat scite author profile

The availability of chemical libraries containing millions of compounds makes identifying lead compounds very hard, yet identifying these compounds is the backbone of the drug discovery process. Hierarchical clustering algorithms are used for this purpose, and the Ward clustering algorithm is one of the most popular in drug discovery applications. A main problem with previous implementations of the Ward algorithm is that they cannot handle large data sets within reasonable time and memory limits. In this paper, the Ward algorithm is implemented in OpenCL. Its first two steps, (1) proximity matrix computation and (2) finding the minimum distance, are modified to run in parallel. Four subsets of the National Cancer Institute (NCI) dataset are used; the smallest contains 500 compounds and the largest contains 10,000. The results show that parallel proximity matrix computation saves 92% of the time for the smallest subset and 99% for the largest, and that the parallel minimum-distance search saves 76% of the time for the smallest subset and 99% for the largest.
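The two steps named in the abstract can be illustrated with a small serial sketch. This is not the paper's OpenCL implementation. It assumes each compound is already encoded as a numeric feature vector and that proximity is squared Euclidean distance, neither of which the abstract specifies. The function names `proximity_matrix`, `closest_pair`, and `ward` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations


def proximity_matrix(points):
    """Step 1: squared Euclidean distance between every pair of points.

    Assumption: compounds are plain numeric vectors; the abstract does
    not say which descriptors or distance measure the paper uses.
    """
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        dist[i][j] = dist[j][i] = d
    return dist


def closest_pair(dist, active):
    """Step 2: the pair of active clusters with the smallest distance."""
    pairs = combinations(sorted(active), 2)
    return min(pairs, key=lambda p: dist[p[0]][p[1]])


def ward(points):
    """Return the merge history as (cluster_i, cluster_j, distance)."""
    dist = proximity_matrix(points)
    size = {i: 1 for i in range(len(points))}
    active = set(size)
    merges = []
    while len(active) > 1:
        i, j = closest_pair(dist, active)
        merges.append((i, j, dist[i][j]))
        # Lance-Williams update for Ward: cluster j is folded into i.
        for k in active - {i, j}:
            total = size[i] + size[j] + size[k]
            dist[i][k] = dist[k][i] = (
                (size[i] + size[k]) * dist[i][k]
                + (size[j] + size[k]) * dist[j][k]
                - size[k] * dist[i][j]
            ) / total
        size[i] += size[j]
        active.remove(j)
    return merges


if __name__ == "__main__":
    # Two tight pairs of points that lie far apart from each other.
    sample = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    for step in ward(sample):
        print(step)
```

In this sketch the pairwise loop in `proximity_matrix` and the scan in `closest_pair` are the parts whose iterations are independent of one another, which is plausibly why those two steps are the ones the abstract reports running in parallel.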



Mohamed Malhat

Improving instance selection methods for big data classification

Clustering of chemical data sets for drug discovery

A new approach for instance selection: Algorithms, evaluation, and comparisons

Parallel Ward Clustering for Chemical Compounds Using MapReduce

Parallel ward clustering for chemical compounds using OpenCL
