In this paper, structure-activity relationship (SAR, classification) and quantitative structure-activity relationship (QSAR) models were established to predict the bioactivity of human epidermal growth factor receptor-2 (HER2) inhibitors. For the SAR study, we built six classification models to distinguish highly active from weakly active HER2 inhibitors. The dataset of 868 HER2 inhibitors was split into a training set of 580 inhibitors and a test set of 288 inhibitors, either by a Kohonen self-organizing map (SOM) or by random selection. The SAR models were built using support vector machine (SVM), random forest (RF) and multilayer perceptron (MLP) methods. Among the six models, the SVM models outperformed the others. The best model (model 1A) achieved a prediction accuracy of 90.27% and a Matthews correlation coefficient (MCC) of 0.80 on the test set. For the QSAR study, we selected 286 HER2 inhibitors to build six quantitative prediction models using multiple linear regression (MLR), SVM and MLP methods. The best model (model 4B) achieved a correlation coefficient (r) of 0.92 on the test set. Descriptor analysis showed that HAccN, lone pair electronegativity and π electronegativity were closely related to the bioactivity of HER2 inhibitors.
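For readers unfamiliar with the two classification metrics reported above, the following minimal sketch shows how accuracy and MCC are commonly computed from a binary confusion matrix. The function names and the example counts are hypothetical and chosen only for illustration. They are not taken from the paper and do not reproduce its results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal sum is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not the paper's data):
# "highly active" is treated as the positive class.
example = dict(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy = {accuracy(**example):.4f}")
print(f"MCC      = {mcc(**example):.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the highly and weakly active classes are imbalanced.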