Cardiovascular disease is one of the most serious threats to human health worldwide, so a high-quality prediction model is of great value for its prevention and treatment. In the feature selection stage, three new strong features are constructed from domain knowledge about the disease and added to the original data set, and the relationships among the features are analyzed with a correlation coefficient map. A random forest algorithm is also used for feature selection, yielding an importance ranking of the features. To further improve predictive performance, a cardiovascular disease prediction model based on R-Lookahead-LSTM is proposed. The stochastic gradient descent algorithm used for the fast-weight update in the Lookahead algorithm is replaced with the Rectified Adam algorithm; the Tanh activation function is replaced with the Softsign activation function to promote convergence; and the resulting R-Lookahead algorithm is used to optimize the long short-term memory (LSTM) network. These changes allow the LSTM network to stabilize sooner, and the improved network is applied to cardiovascular disease prediction.
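The two ingredients named above, the Softsign activation and the Lookahead fast/slow weight update, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`softsign`, `lookahead_minimize`, `sgd_step`, `quadratic_grad`), the values of `k`, `alpha`, and `lr`, and the toy quadratic objective are all assumptions chosen for illustration. A plain gradient step stands in for the inner optimizer; in the abstract's R-Lookahead variant that inner step would be Rectified Adam.

```python
def softsign(x: float) -> float:
    """Softsign activation x / (1 + |x|): bounded in (-1, 1) like tanh,
    but it approaches its limits polynomially rather than exponentially."""
    return x / (1.0 + abs(x))


def sgd_step(theta, grad, lr=0.1):
    # Stand-in inner optimizer (assumption). The abstract swaps this
    # fast-weight update for Rectified Adam.
    return [t - lr * g for t, g in zip(theta, grad)]


def lookahead_minimize(grad_fn, theta0, inner_step, k=5, alpha=0.5, outer_iters=50):
    """Lookahead: take k fast-weight steps from the slow weights, then move
    the slow weights a fraction alpha of the way toward the fast weights."""
    slow = list(theta0)
    for _ in range(outer_iters):
        fast = list(slow)                      # fast weights start at slow weights
        for _ in range(k):
            fast = inner_step(fast, grad_fn(fast))
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


def quadratic_grad(theta):
    # Gradient of the toy objective sum((t - 3)^2), minimized at t = 3.
    return [2.0 * (t - 3.0) for t in theta]


result = lookahead_minimize(quadratic_grad, [0.0, 10.0], sgd_step)
print(result)               # both coordinates end up close to 3.0
print(softsign(1.0))        # 0.5
```

Passing the inner optimizer as a function keeps the slow-weight interpolation separate from the fast-weight rule, which is the part the abstract modifies.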
To address the high-dimensional features and data imbalance that degrade prediction results in credit risk assessment, a credit risk prediction method based on ensemble learning is proposed, working at three levels: features, data, and algorithms. First, a hybrid filter and Random Forest feature selection method is used to select features. This method uses an improved Relief algorithm for an initial selection, then applies the maximum information coefficient to eliminate redundant features, and finally uses the Random Forest algorithm to further reduce the feature dimension. Second, an adaptive idea is added to the Borderline-SMOTE method so that a different number of new samples is generated for each minority sample at the boundary, and a new interpolation method is used to make the distribution of the new samples more reasonable, thereby reducing the sample imbalance. Finally, Focal Loss is used to improve the loss function of LightGBM: the sample weights are adjusted through the parameters α and γ of the Focal Loss function so that the model pays more attention to minority samples and hard-to-classify samples, which improves classification accuracy. The improved algorithm is used as the base classifier, and AdaBoost and random subspace methods are then used to build the ensemble that forms the credit risk prediction model. Comparative experiments with other methods show that this method effectively improves the G-mean and AUC values and predicts defaults better.
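The role of α and γ can be seen in a minimal sketch of the standard binary focal loss, FL = -α_t (1 - p_t)^γ log(p_t). This is only the per-sample loss value, not the authors' modified LightGBM objective (a custom LightGBM objective would also need the gradient and Hessian with respect to the raw score). The function name `focal_loss` and the defaults α = 0.25 and γ = 2.0 are assumptions for illustration.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for one sample.

    p     : predicted probability of the positive class
    y     : true label, 0 or 1
    alpha : class weight applied to positives (1 - alpha goes to negatives)
    gamma : focusing parameter; larger values down-weight easy samples more
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)   # confident and correct
hard = focal_loss(0.1, 1)   # confident and wrong
print(easy, hard)           # the hard sample dominates the loss
```

With γ = 0 the modulating factor disappears and the expression reduces to α-weighted cross-entropy, so γ alone controls how strongly well-classified samples are discounted.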