2018 IEEE 34th International Conference on Data Engineering (ICDE) 2018
DOI: 10.1109/icde.2018.00115
Fast k-Means Based on k-NN Graph

Abstract: In the era of big data, k-means clustering has been widely adopted as a basic processing tool in various contexts. However, its computational cost can be prohibitively high when the data size and the number of clusters are large. It is well known that the processing bottleneck of k-means lies in the operation of seeking the closest centroid in each iteration. In this paper, a novel solution to the scalability issue of k-means is presented. In the proposal, k-means is supported by an approximate k-nearest neighbor…
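The per-iteration bottleneck the abstract refers to can be illustrated with a minimal sketch. This is not the paper's graph-based method (the abstract is truncated before describing it); it is just the standard brute-force assignment step, with hypothetical function and variable names, showing why each iteration costs O(n·k) distance computations:

```python
import numpy as np

def assign_closest_centroid(points, centroids):
    """Brute-force k-means assignment step: every point is compared
    against all k centroids, which is the O(n * k) bottleneck that
    graph-based approximations aim to avoid."""
    # squared Euclidean distance from each point to each centroid,
    # via broadcasting: shape (n, k)
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # index of the nearest centroid for each point
    return dists.argmin(axis=1)

# toy example: 6 points forming two obvious clusters, 2 centroids
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = assign_closest_centroid(points, centroids)  # -> [0 0 0 1 1 1]
```

An approximate k-NN graph over the data lets each point restrict this search to a neighborhood rather than scanning all k centroids, which is the direction the paper's title indicates.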

Cited by 15 publications (2 citation statements)
References 36 publications
“…For example, nonlinear SVM can be capable of dealing with high-dimensional data but may not be robust to the presence of diverse chemical descriptors. 17 Deng and Zhao 18 reported that the computational cost of KNN increases exponentially with the size of the input samples. Recently, deep learning (DL) has attracted much attention for predicting the outcome of biological assays and becomes a key candidate for toxicity prediction due to its ability to bypass feature extraction.…”
Section: Introduction
confidence: 99%
“…These models perform relatively better on smaller data sets with fewer preselected features. One key limitation of KNN algorithm is the exponential rise of computational cost with the size of the input samples. In contrast, nonlinear SVMs can manage high dimensional data but do not exhibit sufficiently robust performance on diverse chemical descriptors …”
Section: Introduction
confidence: 99%