Stacking ensemble with parsimonious base models to improve generalization capability in the characterization of steel bolted components

Pernía‐Espinoza, Alpha; Ceniceros, Julio Fernández; Antonanzas, J.; Urraca, Rubén; Martínez-de-Pisón, Francisco Javier

doi:10.1016/j.asoc.2018.06.005

Cited by 36 publications

(16 citation statements)

References 37 publications


“…Murphree et al applied a stacking mechanism to predict the likelihood of adverse reactions induced by blood transfusions [49]. Pernia-Espinoza et al applied stacking to predict the three key points of a comprehensive force-displacement curve for bolted joints in steel structures [50]. Stacking provides a natural and effective method of combining various (often conflicting) findings from independent research activities.…”

Section: Discussion (mentioning)

confidence: 99%

A novel model for malaria prediction based on ensemble algorithms

Wang et al. 2019, PLoS ONE


Background and objective: Most previous studies adopted single traditional time series models to predict incidences of malaria. A single model cannot effectively capture all the properties of the data structure. However, a stacking architecture can solve this problem by combining distinct algorithms and models. This study compares the performance of traditional time series models and deep learning algorithms in malaria case prediction and explores the application value of stacking methods in the field of infectious disease prediction. Methods: The ARIMA, STL+ARIMA, BP-ANN and LSTM network models were separately applied in simulations using malaria data and meteorological data in Yunnan Province from 2011 to 2017. We compared the predictive performance of each model through three evaluation measures: RMSE, MASE and MAD. In addition, gradient-boosting regression trees (GBRTs) were used to combine the above four models. We also determined whether the stacking structure improved the model prediction performance. Results: The root mean square errors (RMSEs) of the four sub-models were 13.176, 14.543, 9.571 and 7.208; the mean absolute scaled errors (MASEs) were 0.469, 0.472, 0.296 and 0.266; and the mean absolute deviations (MADs) were 6.403, 7.658, 5.871 and 5.691. After using the stacking architecture to combine the above four models, the RMSE, MASE and MAD values of the ensemble model decreased to 6.810, 0.224 and 4.625, respectively. Conclusions: A novel ensemble model based on the robustness of structured prediction and model combination through stacking was developed. The findings suggest that the predictive performance of the final model is superior to that of the four sub-models, indicating that stacking architecture may have significant implications in infectious disease prediction.
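The three evaluation measures named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the abstract does not give formulas, so MAD is assumed here to be the mean absolute forecast error and MASE is assumed to be that error scaled by the in-sample error of a naive one-step forecast. The function names and the toy series are hypothetical.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error of the forecasts."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mad(actual, predicted):
    """Mean absolute deviation of forecasts from observations (assumed definition)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mase(actual, predicted, train):
    """MAD scaled by the in-sample error of a naive one-step forecast (assumed definition)."""
    naive_error = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train)))
    naive_error /= len(train) - 1
    return mad(actual, predicted) / naive_error

# Toy numbers for illustration only; they are not taken from the paper.
train = [10.0, 12.0, 11.0, 15.0]
actual = [14.0, 18.0]
predicted = [13.0, 16.0]

print(rmse(actual, predicted), mad(actual, predicted), mase(actual, predicted, train))
```

Under the assumed definition, a MASE below 1 means the model beats the naive forecast on average, which is consistent with the sub-1 values reported above.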


“…To select the best prediction model to patients' no-show in future CT appointments we applied the principle of parsimony (i.e. simpler models should be chosen over more complex ones), since the both models displayed similar performance [20]. The eight predictors of no-show to CT exam appointments selected by the penalized logistic regression model are: race, marital status, month, number of no-shows to exams and consultations in the previous year, distance, lead-time and number of exams scheduled in previous year.…”

Section: Results (mentioning)

confidence: 99%

Modeling the No-Show of Patients to Exam Appointments of Computed Tomography

Silva, Fogliatto, Garcia et al. 2020, Preprint


Background: No-shows of patients have negative impacts on healthcare systems, such as resources’ underutilization, efficiency loss, and cost increase. Predicting no-show is key to developing strategies that counteract its effects. In this paper, we propose a model to predict the no-show of ambulatory patients to exam appointments of computed tomography at the Radiology department of a large Brazilian public hospital. Methods: We carried out a retrospective study on 8,382 appointments made to computed tomography (CT) exams between January and December 2017. Penalized logistic regression and multivariate logistic regression were used to model the influence of 15 candidate variables on patients’ no-show. The predictive capabilities of the models were evaluated by analyzing the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC). Results: The no-show rate in computed tomography exam appointments was 6.65%. The two models performed similarly in terms of AUC. The penalized logistic regression model was selected using the parsimony criterion, with 8 of the 15 variables analyzed appearing as significant. One of the variables included in the model (number of exams scheduled in previous year) had not been previously reported in the related literature. Conclusions: Our findings may be used to guide the development of strategies to reduce the no-show of patients to exam appointments.


“…In the literature on machine learning algorithms, there is an agreement that, among several accurate predictive models, the one with the least complexity should be selected. Such a model will probably be more robust for new data (Pernia-Espinoza et al, 2018). The complexity of a model can be assessed from different perspectives, including the model's internal structure (Seni and Elder, 2010), the degrees of freedom (Ye, 1998), the Vapnik-Chervonenkis dimension (Vapnik and Chervonenkis, 2015), or the number of features selected for model construction (Pernia-Espinoza et al, 2018).…”

Section: Review Of Feature Selection Methods (mentioning)

confidence: 99%
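The parsimony rule described in this statement (among models of comparable accuracy, prefer the least complex) can be sketched as a small selection function. This is an illustrative sketch only: complexity is measured by feature count, one of the perspectives listed above, and the `tolerance` threshold, the candidate names and the scores are all hypothetical rather than taken from any of the cited papers.

```python
def pick_parsimonious(candidates, tolerance=0.01):
    """Return the candidate with the fewest features among those whose
    score is within `tolerance` of the best score (higher is better).

    candidates: list of (name, score, n_features) tuples.
    """
    best_score = max(score for _, score, _ in candidates)
    close_enough = [c for c in candidates if best_score - c[1] <= tolerance]
    return min(close_enough, key=lambda c: c[2])

# Hypothetical candidates: (name, validation score, number of features).
candidates = [
    ("model_a", 0.750, 15),
    ("model_b", 0.745, 8),
    ("model_c", 0.600, 2),
]

chosen = pick_parsimonious(candidates)
print(chosen)
```

Here `model_b` is chosen: it is within the tolerance of the best score and uses fewer features than `model_a`, while `model_c` is simpler still but too inaccurate to qualify.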

“…It is widely accepted that feeding a model with too many features not only negatively impacts the computational time, but also compromises its generalization capability. A generalization capability of a predictive model is its ability to predict the response of unseen data (Kohavi and John, 1997;Pernia-Espinoza et al, 2018). Moreover, thanks to Electronic Health Records (EHR) data, an unprecedented source of information has become available to data science researchers, which can help them to discover new influential features (Gallego et al, 2013).…”

Section: Review Of Patient No-show Research Papers (mentioning)

confidence: 99%

“…This stage incorporates an approach referred to as "stacking" or the "meta-ensembling" method. Stacking is a well-known technique that uses the output of multiple predictive models as the basis for the training of another model (Pernia-Espinoza et al, 2018). A stacking model performs better than each base model when high diversity exists between the base predictive models (Dai et al, 2017).…”

Section: Second Stage (mentioning)

confidence: 99%
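The idea in this statement, training a second-level model on the outputs of several base models, can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the method of any cited paper: the two base learners (a least-squares line and a 1-nearest-neighbour regressor), the single-weight meta-learner, the fold scheme and the toy data are all chosen for illustration.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Base model 1: least-squares line; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    """Base model 2: 1-nearest-neighbour regressor; returns a predict function."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def out_of_fold(fit, xs, ys, k=4):
    """Predict each point with a model trained on the other folds only."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds

def fit_stack(xs, ys, k=4):
    """Meta-learner: y ~ w*p1 + (1-w)*p2, with w fitted by least squares
    on the out-of-fold predictions of the two base models."""
    p1 = out_of_fold(fit_linear, xs, ys, k)
    p2 = out_of_fold(fit_nearest, xs, ys, k)
    num = sum((y - b) * (a - b) for y, a, b in zip(ys, p1, p2))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    w = num / den if den else 0.5
    # Refit the base models on all data for final predictions.
    m1, m2 = fit_linear(xs, ys), fit_nearest(xs, ys)
    return w, (lambda x: w * m1(x) + (1 - w) * m2(x))

# Toy data: a line with alternating +/-1 noise (illustration only).
xs = [float(i) for i in range(12)]
ys = [2.0 * x + (1.0 if i % 2 else -1.0) for i, x in enumerate(xs)]

w, stacked = fit_stack(xs, ys)
print(w, stacked(5.5))
```

Fitting the meta-learner on out-of-fold predictions rather than in-sample ones matters here: the nearest-neighbour model reproduces its own training targets exactly, so an in-sample fit would overstate how much it should be trusted.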


A metaheuristic-based stacking model for predicting the risk of patient no-show and late cancellation for neurology appointments

Ahmadi, Garcia-Arce, Masel et al. 2019

IISE Transactions on Healthcare Systems Engineering


Patient no-shows and late cancellations for an appointment are common problems in healthcare, which adversely affect the financial performance and quality of service of healthcare organizations. A high rate of patient no-show and late cancellation in a clinic can significantly limit access to healthcare. In general, hospitals create predictive models to assess risk of no-show, and then assign overbooking appointments utilizing those risks. In this paper, by incorporating machine learning and optimization techniques, we proposed a predictive model to assist with the overbooking decision. The model consists of two phases. First, we utilized a metaheuristic optimization technique to explore the best subset of features (known as the feature selection problem) that can significantly contribute to the prediction outcomes. Second, using the output of the first stage, we proposed a stacking model to improve the prediction performances further. Our extensive computations and comparisons across different classifiers show that formulating the feature selection problem as a multi-objective problem instead of a single-objective problem using random forest classifier yields better results. The proposed model will improve the overbooking at clinics, by increasing the patient access to care. We introduced important new features to the literature that can describe the no-show and late cancellation behavior.
