2021
DOI: 10.1038/s41467-021-24150-4

Bioactivity descriptors for uncharacterized chemical compounds

Abstract: Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well-characterized compounds. Here we present a collection of deep…

Cited by 59 publications (53 citation statements)
References: 44 publications
“…To obtain the best performing models, we tried three different feature extraction methods i.e. bioactivity-based descriptors (Signaturizer library) 46, chemistry-based molecular descriptors (Mordred software) 47, and graph-based features (DeepChem library) 48. In addition to these diversified features, we also tried multiple machine learning/deep learning-based classification algorithms for model building such as Random Forest (RF), Multilayer Perceptron (MLP), k-Nearest Neighbor (KNN), Support Vector Machine (SVM), Stochastic Gradient Descent (SGD), Logistic Regression (LR), GraphConvModel (GCM), Attentive FP (AFP), Graph Convolution Network (GCN), and Graph Attention Network (GAT) (Supplementary Figure 1b).…”
Section: Results (mentioning)
confidence: 99%
“…To obtain the best performing models, we tried three different feature extraction methods i.e. bioactivity-based descriptors (Signaturizer library) 46, chemistry-based molecular descriptors (Mordred software) 47, and graph-based features (DeepChem library) 48. In addition to these diversified features, we also tried 50 , implementation of an array of classifiers for model building, and testing of all models on the same unseen testing dataset.…”
Section: Carcinogenicity Prediction (mentioning)
confidence: 99%
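
The comparison described in the statement above (several feature sets, several classifiers, all scored on the same unseen test set) can be sketched roughly as follows. This is a minimal illustration and not the cited authors' pipeline: the two feature matrices are synthetic stand-ins for real descriptors, the two toy classifiers replace the RF/MLP/SVM/graph models named in the quote, and every name in the code is hypothetical.

```python
import random
from itertools import product


def nearest_centroid(train_X, train_y, x):
    """Toy classifier: predict the label whose class centroid is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def one_nn(train_X, train_y, x):
    """Toy classifier: predict the label of the single closest training row."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, r)) for r in train_X]
    return train_y[dists.index(min(dists))]


def make_features(labels, dim, shift, rng):
    """Synthetic stand-in for a descriptor matrix; class 1 is offset by `shift`."""
    return [[rng.gauss(shift * y, 1.0) for _ in range(dim)] for y in labels]


rng = random.Random(0)
labels = [i % 2 for i in range(200)]  # hypothetical binary endpoint

# Assumption: each entry stands in for one feature extraction method.
feature_sets = {
    "bioactivity_like": make_features(labels, 8, 2.0, rng),
    "chemistry_like": make_features(labels, 8, 1.0, rng),
}
classifiers = {"nearest_centroid": nearest_centroid, "1-NN": one_nn}

# One fixed split, reused for every combination, so that all models are
# scored on the same held-out compounds.
indices = list(range(len(labels)))
rng.shuffle(indices)
train_idx, test_idx = indices[:150], indices[150:]

results = {}
for (fname, X), (cname, clf) in product(feature_sets.items(), classifiers.items()):
    train_X = [X[i] for i in train_idx]
    train_y = [labels[i] for i in train_idx]
    correct = sum(clf(train_X, train_y, X[i]) == labels[i] for i in test_idx)
    results[(fname, cname)] = correct / len(test_idx)

# Rank the feature/classifier combinations by held-out accuracy.
for key, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(key, round(acc, 3))
```

Fixing the split before the loop is the point of the sketch: if each combination drew its own test set, differences in accuracy could reflect the split and not the features or the classifier.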
“…The RDMD contained 200 molecular descriptors, including physicochemical properties and structure characteristics, which have been used in many studies and have achieved satisfactory results [23, 24, 25, 26, 27]. The CCMD is a novel type of biological descriptor containing 25 various bioactive spaces [28]. Moreover, the simple representation of CCMD is compatible with different types of computational tools in a multi-dimensional form.…”
Section: Methods (mentioning)
confidence: 99%
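
As a rough illustration of how a descriptor spanning 25 bioactivity spaces can be handed to generic tools as a single vector, the sketch below concatenates one signature per space. The space codes (A1 to E5, five levels by five sublevels) and the per-space length of 128 are assumptions made for the example and are not stated in the quoted text; `concatenate_signatures` is a hypothetical helper, not part of any library.

```python
from itertools import product

# Assumption: the 25 spaces are named by five levels (A-E) and five sublevels (1-5).
LEVELS = "ABCDE"
SUBLEVELS = "12345"
SPACE_CODES = [f"{lvl}{sub}" for lvl, sub in product(LEVELS, SUBLEVELS)]

# Assumption: every per-space signature has the same fixed length.
SIGNATURE_DIM = 128


def concatenate_signatures(per_space):
    """Join per-space signatures into one flat vector, in SPACE_CODES order.

    `per_space` maps a space code (e.g. "B4") to a list of floats.
    """
    missing = [code for code in SPACE_CODES if code not in per_space]
    if missing:
        raise KeyError(f"missing signatures for spaces: {missing}")
    flat = []
    for code in SPACE_CODES:
        signature = per_space[code]
        if len(signature) != SIGNATURE_DIM:
            raise ValueError(f"space {code}: expected {SIGNATURE_DIM} values")
        flat.extend(signature)
    return flat


# Usage with placeholder values (zeros) standing in for real signatures.
placeholder = {code: [0.0] * SIGNATURE_DIM for code in SPACE_CODES}
vector = concatenate_signatures(placeholder)
print(len(SPACE_CODES), len(vector))
```

Keeping a fixed space order means that a given column range in the flat vector always refers to the same bioactivity space, which is what lets downstream models treat the result like any other numerical descriptor.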
“…Accordingly, graph-based DNNs including message passing networks have increasingly been investigated for learning model-internal representations from molecular structure (Chuang et al., 2020). In addition to graph-based representation learning, DL has recently also been applied to predict biological signatures of test compounds (Bertoni et al., 2021), which might be combined with standard structural descriptors in virtual screening (vide supra). However, on the basis of currently available data, it remains to be determined whether alternative molecular representations, be they learned from graphs or predicted, might yield higher performance in ML and other applications than long-used standards such as molecular fingerprints or numerical descriptors.…”
Section: Deep Neural Network (mentioning)
confidence: 99%