2018
DOI: 10.1021/acs.jcim.8b00685

Comparative Study of Multitask Toxicity Modeling on a Broad Chemical Space

Abstract: Acute toxicity is one of the most challenging properties to predict purely with computational methods due to its direct relationship to biological interactions. Moreover, toxicity can be represented by different endpoints: it can be measured for different species using different types of administration, etc., and it is questionable if the knowledge transfer between endpoints is possible. We performed a comparative study of prediction multi-task toxicity for a broad chemical space using different descriptors an…

Citations: cited by 82 publications (85 citation statements)
References: 54 publications (67 reference statements)
“…Moreover, it can be observed that the top three models of all the datasets were mainly occupied by the descriptor-based models (the ratio is 24/33=73%), substantiating the more powerful predictive abilities of the descriptor-based models compared to the graph-based models. Here what we found is that the graph-based models can outperform the descriptor-based models on some lager or multi-task datasets such as the HIV, Tox21 and ToxCast datasets, which is well accord with the previous conclusions where DNN excel at larger amounts of data and multi-task learning [65,66]. However, to build such generalizable and robust deep models requires large-scale high-quality datasets and the datasets in the practical drug discovery campaigns routinely suffer from narrow chemical diversity and insignificant sample sizes [67].…”
supporting
confidence: 91%
“…Numerous studies demonstrated that multi-task models have advantages over single-task models due to their ability to excavate the inconspicuous hidden relations between different subtasks and transparently share the learned features among all the tasks. [58,65,66] Nevertheless, the performance of multi-task models is highly related to the favorable correlations of individual tasks but such ready-to-use tasks are not so commonly seen in practical drug discovery campaigns.…”
Section: Performance Of Descriptor-based and Graph-based Models
mentioning
confidence: 99%
“…Sml2canSml was added as CDDD descriptors to OCHEM. These descriptors were analysed by the same methods as used in the previous work, i.e., LibSVM [57], Random Forest [58], XGBoost [59] as well as by Associative Neural Networks (ASNN) [60] and Deep Neural Networks [61]. Exactly the same protocol, fivefold cross-validation, was used for all calculations.…”
Section: QSAR Modeling
mentioning
confidence: 99%
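The fivefold cross-validation protocol mentioned in the statement above can be sketched as follows. This is a minimal illustration in plain Python, not the OCHEM implementation: the names `five_fold_splits`, `cross_validate`, `fit_mean`, and `predict_mean` are hypothetical, and the mean-predicting "model" is only a stand-in for the learners the statement lists. The point it shows is that fixing the splits once lets every method be evaluated on exactly the same folds.

```python
import random


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx


def cross_validate(fit, predict, X, y, n_folds=5, seed=0):
    """Return out-of-fold predictions; the same seed gives the same splits for any model."""
    out_of_fold = [None] * len(X)
    for train_idx, test_idx in five_fold_splits(len(X), n_folds, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            out_of_fold[i] = predict(model, X[i])
    return out_of_fold


# Hypothetical stand-in model: always predicts the mean of its training labels.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model
```

Any pair of `fit`/`predict` callables can be passed to `cross_validate` with the same `seed`, so the resulting out-of-fold predictions are directly comparable across methods.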
“…Their detailed description can be found elsewhere [4]. Associative Neural Networks (ASNN) [11], Deep Neural Network (DNN) [12], Extreme Gradient Boost (XGBOOST) [13], and Least Squares Support Vector Machine (LSSVM) [14] algorithms were analyzed for training the models. The methods were used with default parameters as specified on the OCHEM web site.…”
Section: Methods
mentioning
confidence: 99%