Drug-induced liver injury (DILI) is a major concern in both drug development and drug safety. If DILI cannot be effectively predicted during development, a drug may later be withdrawn from the market. Predicting DILI is therefore crucial at the early stages of drug research. This work presents a two-class ensemble classifier for predicting DILI, built on 2D molecular descriptors and fingerprints from a dataset of 450 compounds. The purpose of our study is to identify the key molecular fingerprints that may contribute to DILI risk, and then to build a reliable ensemble model that predicts DILI risk from these key factors. Experimental results suggested that 8 molecular fingerprints are critical for predicting DILI, and also identified the best ratio of molecular fingerprints to molecular descriptors. In 5-fold cross-validation, the ensemble vote classifier achieved an accuracy of 77.25%, and its accuracy on the test set was 81.67%. This model could be used for DILI prediction.
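To make the voting step concrete, here is a minimal sketch of hard (majority) voting over binary fingerprint vectors. It assumes hard voting, which the abstract does not specify, and the three base classifiers (`rule_bit0`, `rule_bit1`, `rule_any_two`) are hypothetical toy rules, not the models or the 8 fingerprints identified in the study.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A base classifier maps a binary fingerprint vector to a label:
# 1 = DILI-positive, 0 = DILI-negative.
Classifier = Callable[[Sequence[int]], int]


# Hypothetical toy rules standing in for trained base models.
def rule_bit0(fp: Sequence[int]) -> int:
    return 1 if fp[0] == 1 else 0


def rule_bit1(fp: Sequence[int]) -> int:
    return 1 if fp[1] == 1 else 0


def rule_any_two(fp: Sequence[int]) -> int:
    return 1 if sum(fp) >= 2 else 0


def hard_vote(classifiers: List[Classifier], fp: Sequence[int]) -> int:
    """Return the label predicted by the majority of base classifiers.

    With an odd number of classifiers and two classes, ties cannot occur.
    """
    votes = Counter(clf(fp) for clf in classifiers)
    return votes.most_common(1)[0][0]


def accuracy(classifiers: List[Classifier],
             samples: List[Sequence[int]],
             labels: List[int]) -> float:
    """Fraction of samples whose majority-vote label matches the true label."""
    correct = sum(hard_vote(classifiers, fp) == y
                  for fp, y in zip(samples, labels))
    return correct / len(labels)


ENSEMBLE: List[Classifier] = [rule_bit0, rule_bit1, rule_any_two]
```

In a cross-validation setting, `accuracy` would be computed on each held-out fold after refitting the base models on the remaining folds, and the fold accuracies averaged. The toy rules here are fixed, so no fitting step is shown.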
Background: The occurrence of cotton pests and diseases has always been an important factor affecting total cotton production. Cotton growth depends heavily on environmental factors, especially climate change. In recent years, machine learning, and deep learning in particular, has been widely used in many fields and has achieved good results. Methods: First, this paper used the common Apriori algorithm to find association rules between weather factors and the occurrence of cotton pests. Then, the problem of predicting the occurrence of pests and diseases was formulated as a time series prediction problem, and an LSTM-based method was developed to solve it. Results: The association analysis reveals that cotton pests and diseases are more likely to occur under moderate temperature, humid air, low wind speed, and rainfall in autumn and winter. This finding was then used to predict the occurrence of pests and diseases. Experimental results showed that the LSTM performs well in predicting the occurrence of pests and diseases in cotton fields, yielding an Area Under the Curve (AUC) of 0.97. Conclusion: Suitable temperature, humidity, low rainfall, low wind speed, suitable sunshine time, and low evaporation are more likely to lead to cotton pests and diseases. Based on these associations as well as historical weather and pest records, an LSTM network is a good predictor of future pest and disease occurrences. Moreover, the LSTM network performs better than the traditional machine learning models it was compared with (i.e., SVM and Random Forest).
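The association-rule step can be illustrated with a small, self-contained sketch of Apriori-style frequent itemset mining and rule confidence. The item names and the records in `RECORDS` are invented for illustration and are not the study's data; the paper's actual discretization of weather factors and its support and confidence thresholds are not given in the abstract, so the values below are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(transactions: List[Set[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least min_support."""
    n = len(transactions)

    def support(itemset: FrozenSet[str]) -> float:
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        scored = {c: support(c) for c in current}
        level = {c: s for c, s in scored.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def confidence(frequent: Dict[FrozenSet[str], float],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Confidence of the rule antecedent -> consequent."""
    return frequent[antecedent | consequent] / frequent[antecedent]


# Hypothetical discretized weather records; "pest" marks a pest occurrence.
RECORDS: List[Set[str]] = [
    {"humid", "low_wind", "pest"},
    {"humid", "low_wind", "pest"},
    {"humid", "high_wind"},
    {"dry", "low_wind"},
    {"humid", "low_wind", "pest"},
]
```

Calling `apriori(RECORDS, 0.5)` and then `confidence(...)` on the result gives the strength of rules such as `{humid, low_wind} -> {pest}`. In this toy data, that rule holds in every record where both conditions occur.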