In the current study, we propose a new quantitative read-across methodology for predicting the toxicity of newly synthesized nanoparticles (NPs) based on their similarity to structural analogues.
In this study, the specific surface area of various perovskites was modeled using a novel quantitative read-across structure-property relationship (q-RASPR) approach, which combines Read-Across (RA) with quantitative structure-property relationship (QSPR) modeling. After optimization of the hyper-parameters, certain similarity-based error measures were obtained for each query compound. By combining some of these error-based measures and the Read-Across prediction function with the previously selected features, a number of machine learning models were developed using partial least squares (PLS), ridge regression (RR), linear support vector regression (LSVR), and random forest (RF) regression. Based on external prediction quality and interpretability, the PLS model was selected as the best predictor, which underscores previously reported results. The finally selected model should efficiently predict the specific surface areas of other perovskites for their use in photocatalysis. The new q-RASPR method also appears promising for the prediction of several other property endpoints of interest in materials science.
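The read-across step described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the Gaussian-kernel similarity, the choice of `k` and `sigma`, and the names `read_across` and `augment` are all assumptions, and the abstract does not state which similarity or error measures were actually used. The sketch computes a similarity-weighted prediction from the nearest training compounds, derives two simple similarity-based measures, and appends them to the original descriptors so that a downstream regressor (PLS, ridge, and so on) could use them.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def read_across(query, train_X, train_y, k=3, sigma=1.0):
    """Similarity-weighted read-across for one query compound.

    Returns (prediction, weighted SD of neighbour responses, mean similarity).
    The Gaussian kernel and the two extra measures are assumptions.
    """
    sims = [math.exp(-euclidean(query, x) ** 2 / (2 * sigma ** 2))
            for x in train_X]
    # Indices of the k most similar training compounds.
    nearest = sorted(range(len(train_X)), key=lambda i: sims[i], reverse=True)[:k]
    w_sum = sum(sims[i] for i in nearest)
    if w_sum == 0.0:
        # All similarities underflowed: fall back to an unweighted mean.
        pred = sum(train_y[i] for i in nearest) / len(nearest)
        return pred, 0.0, 0.0
    pred = sum(sims[i] * train_y[i] for i in nearest) / w_sum
    var = sum(sims[i] * (train_y[i] - pred) ** 2 for i in nearest) / w_sum
    return pred, math.sqrt(var), w_sum / len(nearest)

def augment(query, train_X, train_y, k=3, sigma=1.0):
    """Original descriptors plus the read-across prediction and measures."""
    pred, sd, mean_sim = read_across(query, train_X, train_y, k, sigma)
    return list(query) + [pred, sd, mean_sim]

# Toy data (hypothetical): one descriptor, three training compounds.
train_X = [[0.0], [1.0], [2.0]]
train_y = [1.0, 2.0, 3.0]
row = augment([1.0], train_X, train_y)
```

When augmenting the training set itself, each training compound would normally be excluded from its own neighbour list so that the read-across prediction does not leak its own response value.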
Quantitative structure–activity relationship (QSAR) modeling is a well-known in silico technique with extensive applications in several major fields such as drug design, predictive toxicology, materials science, and food science. Handling small-sized datasets due to the lack of experimental data for specialized endpoints is a crucial task for the QSAR researcher. In the present study, we propose an integrated workflow/scheme capable of dealing with small dataset modeling that integrates dataset curation, “exhaustive” double cross-validation and a set of optimal model selection techniques including consensus predictions. We have developed two software tools, namely, Small Dataset Curator, version 1.0.0, and Small Dataset Modeler, version 1.0.0, to effortlessly execute the proposed workflow. These tools are freely available for download. We have performed case studies employing seven diverse datasets to demonstrate the performance of the proposed scheme (including data curation) for small dataset QSAR modeling. The case studies also confirm the usability and stability of the developed software tools.
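The double cross-validation idea mentioned above can be illustrated with a small sketch. It is not the Small Dataset Modeler's algorithm: the single-descriptor least-squares model, the leave-one-out outer loop, and the names `fit_line`, `inner_select`, and `double_cv` are assumptions made for illustration, and "exhaustive" is read here as enumerating every calibration/validation split of the training set in the inner loop.

```python
from itertools import combinations

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b

def inner_select(X, y, n_val=2):
    """Inner loop: try every calibration/validation split of the training set
    and return the descriptor column with the lowest total validation error."""
    idx = range(len(X))
    n_desc = len(X[0])
    errors = [0.0] * n_desc
    for val in combinations(idx, n_val):
        cal = [i for i in idx if i not in val]
        for j in range(n_desc):
            a, b = fit_line([X[i][j] for i in cal], [y[i] for i in cal])
            errors[j] += sum((y[i] - (a + b * X[i][j])) ** 2 for i in val) / n_val
    return min(range(n_desc), key=lambda j: errors[j])

def double_cv(X, y, n_val=2):
    """Outer loop (leave-one-out here): model selection never sees the
    held-out compound. Returns a list of (chosen descriptor, prediction)."""
    results = []
    for t in range(len(X)):
        train = [i for i in range(len(X)) if i != t]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        j = inner_select(X_tr, y_tr, n_val)
        a, b = fit_line([row[j] for row in X_tr], y_tr)
        results.append((j, a + b * X[t][j]))
    return results

# Toy data (hypothetical): descriptor 0 is informative, descriptor 1 is noise.
X = [[1, 5], [2, 1], [3, 4], [4, 2], [5, 3], [6, 6]]
y = [2, 4, 6, 8, 10, 12]
results = double_cv(X, y)
```

The number of inner splits grows combinatorially with the training set size, which is why exhaustive enumeration of this kind is practical only for small datasets.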