Prediction of drug solubility is a crucial problem in the pharmaceutical industry for both drug delivery and drug discovery. Several theoretical approaches have been proposed to predict drug solubility in mixed solvent systems when the solubility values in the pure solvents are known. Quantitative structure-property relationship (QSPR) approaches are gaining attention for predicting various physical properties because of their robustness and computational tractability. In this work, a machine learning based QSPR approach is proposed to predict drug solubility in binary solvent systems using structural features such as molar refractivity, McGowan volume, and topological surface area. A genetic algorithm based feature selection procedure is used to check the dependency between the selected features and to obtain the final set of significant features. Initially, solubility is assumed to vary linearly with the structural features, and the model parameters are estimated using ordinary least squares and a weight-based optimization approach. Solubility is then assumed to be piecewise linear in the structural features, and the parameters of multiple models (MM) are identified using a prediction error based clustering approach. The efficacy of the proposed approaches is demonstrated on drug solubility data collected from the literature. As a benchmark for the proposed MM approach, neural network based nonlinear models with different configurations, trained with the Levenberg–Marquardt algorithm, are also tested. Because the clustering relies on prediction errors, which are not available for an unseen sample, a novel testing strategy is also proposed to identify a suitable model for a test sample.
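The piecewise linear idea in the abstract can be illustrated with a minimal sketch of prediction error based clustering. This is not the authors' implementation: it assumes a single descriptor, plain ordinary least squares for each local model, a random initial assignment, and synthetic data. The names `fit_line`, `pe_cluster`, and `total_sse` are hypothetical. The loop alternates between refitting one line per cluster and reassigning each sample to the line that predicts it with the smallest squared error.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # All x values identical: fall back to a constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def sq_err(model, x, y):
    """Squared prediction error of one sample under one linear model."""
    a, b = model
    return (y - (a + b * x)) ** 2


def total_sse(models, labels, xs, ys):
    """Sum of squared errors when each sample uses its assigned model."""
    return sum(sq_err(models[l], x, y) for l, x, y in zip(labels, xs, ys))


def pe_cluster(xs, ys, k=2, max_iter=100, seed=0):
    """Alternate between per-cluster OLS fits and error-based reassignment.

    Assumption: random initial labels; the result is a local optimum and
    may depend on the seed.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in xs]
    models = [(0.0, 0.0)] * k
    for _ in range(max_iter):
        # Step 1: refit each non-empty cluster; empty clusters keep their model.
        for j in range(k):
            idx = [i for i, l in enumerate(labels) if l == j]
            if idx:
                models[j] = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        # Step 2: move every sample to the model with the smallest error.
        new_labels = [
            min(range(k), key=lambda j: sq_err(models[j], x, y))
            for x, y in zip(xs, ys)
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return models, labels


if __name__ == "__main__":
    # Synthetic two-regime data (hypothetical, not solubility measurements).
    xs = [i / 39 for i in range(40)]
    ys = [1 + 2 * x if x < 0.5 else 4 - 3 * x for x in xs]
    models, labels = pe_cluster(xs, ys, k=2)
    print("local models (intercept, slope):", models)
    print("clustered SSE:", total_sse(models, labels, xs, ys))
```

Neither step can increase the total squared error, so the clustered fit is never worse than a single global line on the training data. The sketch does not cover the step the abstract calls the testing strategy: reassignment needs the measured response, so a new sample requires a separate rule for choosing its model, and that rule is not described in the abstract.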
Model building and parameter estimation are traditional concepts widely used in the chemical, biological, metallurgical, and manufacturing industries. Early modeling methodologies focused on mathematically capturing the process knowledge and domain expertise of the modeler. The models thus developed are termed first principles models (or white‐box models). Over time, computational power became cheaper, and massive amounts of data became available for modeling. This led to the development of cutting‐edge machine learning models (black‐box models) and artificial intelligence (AI) techniques. Hybrid models (gray‐box models) combine first principles and machine learning models. Their development has captured the attention of researchers because they combine the best of both modeling paradigms. Recent attention to this field stems from the interest in explainable AI (XAI), a critical requirement as AI systems become more pervasive. This work aims to identify and categorize the hybrid models available in the literature that integrate machine‐learning models with different forms of domain knowledge. The benefits of combining the two approaches, such as enhanced predictive power and extrapolation capability, are summarized. The goal of this article is to consolidate the published corpus in the area of hybrid modeling and to develop a comprehensive framework for understanding the various techniques presented. This framework can further serve as a foundation for exploring rational associations between several models.