Background: In the early stages of the COVID-19 pandemic, our institution was interested in forecasting how long surgical patients receiving elective procedures would spend in the hospital. Initial examination of our models indicated that, because length of stay is highly skewed, accurate prediction was challenging, and we instead opted for a simpler classification model. In this work we perform a deeper examination of predicting in-hospital length of stay. Methods: We used electronic health record data on length of stay from 42,209 elective surgeries. We compared different loss functions (mean squared error, mean absolute error, mean relative error), algorithms (LASSO, random forests, multilayer perceptron), and data transformations (log and truncation). We also assessed the performance of a two-stage hybrid classification-regression approach. Results: Our results show that while it is possible to accurately predict short lengths of stay, predicting longer lengths of stay is extremely challenging. As such, we opt for a two-stage model: the first stage classifies patients into long versus short lengths of stay, and the second stage fits a regressor among those predicted to have a short length of stay. Discussion: The results indicate both the challenges and the considerations involved in applying machine-learning methods to skewed outcomes. Conclusions: Two-stage models allow those developing clinical decision support tools to explicitly acknowledge where they can and cannot make accurate predictions.
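The two-stage idea described in the Results can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the single `risk` feature, the toy data, the 7-day `long_cutoff`, and the use of a one-feature threshold classifier and ordinary least squares in place of the LASSO, random forest, and multilayer perceptron models the abstract names.

```python
from statistics import mean

def fit_stump(x, is_long):
    """Stage 1 (stand-in classifier): choose the cutoff on x that
    misclassifies the fewest long-versus-short labels."""
    best_cut, best_err = None, float("inf")
    for cut in sorted(set(x)):
        err = sum((xi > cut) != yi for xi, yi in zip(x, is_long))
        if err < best_err:
            best_cut, best_err = cut, err
    return best_cut

def fit_line(x, y):
    """Stage 2 (stand-in regressor): ordinary least squares for y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def fit_two_stage(x, los, long_cutoff):
    """Fit the classifier on all cases, then the regressor only on
    cases the classifier predicts to be short stays."""
    is_long = [d > long_cutoff for d in los]
    cut = fit_stump(x, is_long)
    short = [(xi, d) for xi, d in zip(x, los) if xi <= cut]
    a, b = fit_line([s[0] for s in short], [s[1] for s in short])
    return cut, a, b

def predict(model, xi):
    """Return a length-of-stay forecast in days, or None when the case is
    flagged as a long stay and no numeric forecast is attempted."""
    cut, a, b = model
    if xi > cut:
        return None
    return a + b * xi

# Hypothetical toy data: one risk score per patient and length of stay in days.
risk = [1, 2, 3, 4, 5, 6, 7, 8]
los_days = [1.0, 2.0, 3.0, 4.0, 5.0, 21.0, 9.0, 40.0]
model = fit_two_stage(risk, los_days, long_cutoff=7.0)

print(predict(model, 3))  # numeric forecast for a predicted short stay
print(predict(model, 7))  # None: flagged as a long stay
```

Returning `None` for predicted long stays makes explicit the point raised in the Conclusions: the tool declines to give a number where it cannot predict accurately.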
Understanding the drivers of PM2.5 and fire carbon emission (FCE), and the relationship between them, is important for the prevention and control of severe PM2.5 exposure, and for related policy formulation, in areas where biomass burning is a major source. In this study, we took northern Laos as the study area and used spatial cluster analysis to present the spatial pattern of PM2.5 and FCE from 2003 to 2019. Using a random forest and a structural equation model, we explored the relationship between PM2.5 and FCE and their drivers. The key results for the study period were as follows: (1) the HH (high/high) clusters of PM2.5 concentration and FCE were very similar and were distributed in the west of the study area; (2) compared with the contribution of climate variables, the contribution of FCE to PM2.5 was weak but statistically significant. The standardized coefficients were 0.5 for drought index, 0.32 for diurnal temperature range, and 0.22 for FCE; (3) climate factors are the main drivers of PM2.5 and FCE in northern Laos, among which drought and diurnal temperature range are the most influential. We believe that, as climate-driven heat intensifies in tropical rainforests, these findings can help regulators and researchers better integrate drought and diurnal temperature range into FCE and PM2.5 predictive models in order to develop effective measures to prevent and control air pollution in areas affected by biomass combustion.
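To make the term "standardized coefficient" concrete, the sketch below computes standardized coefficients for a plain two-predictor linear regression from pairwise correlations. This is a simplification, not the structural equation model used in the study, and the variable names and toy values are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / sqrt(va * vb)

def standardized_betas(x1, x2, y):
    """Standardized coefficients of y on x1 and x2, i.e. the slopes one
    would get after z-scoring all three variables."""
    r1y, r2y, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r12 * r2y) / denom
    beta2 = (r2y - r12 * r1y) / denom
    return beta1, beta2

# Hypothetical toy predictors (uncorrelated by construction) and response.
drought = [-1.0, 1.0, -1.0, 1.0]
fce = [-1.0, -1.0, 1.0, 1.0]
pm25 = [2 * d + f for d, f in zip(drought, fce)]

b_drought, b_fce = standardized_betas(drought, fce, pm25)
print(b_drought, b_fce)
```

Because both predictors are on the same unitless scale after standardization, the larger coefficient indicates the stronger association, which is how the reported values for drought index, diurnal temperature range, and FCE can be compared.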
Establishing an efficient PM2.5 prediction model and gaining in-depth knowledge of the relationship between the model's predictors and PM2.5 are important for preventing and controlling PM2.5 pollution, and for policy formulation, in the Yangtze River Delta (YRD), where air pollution is serious. In this study, the spatial pattern of PM2.5 concentration in the YRD during 2003–2019 was analyzed by Hot Spot Analysis. We employed five algorithms to train, validate, and test models on 17 years of data in the YRD, and we explored the drivers of PM2.5 exposure. Our key results were as follows: (1) High PM2.5 pollution in the YRD was concentrated in the western and northwestern regions and remained stable for 17 years. Compared to 2003, PM2.5 increased by 10–20% in the southeast, southwest, and western regions in 2019. The hot spot for percentage change of PM2.5 was mostly located in the southwest and southeast regions in 2019, while the interannual change showed a variable spatial distribution pattern. (2) Geographically Weighted Random Forest (GWRF) has great advantages in predicting the presence of PM2.5 in comparison with the other models. GWRF not only improves on the performance of RF, but also spatializes the interpretation of variables. (3) Climate and human activities are the most important drivers of PM2.5 concentration. Drought, temperature, and temperature difference are the most critical and potentially threatening climatic factors for the increase and expansion of PM2.5 in the YRD. With the warming and drying trend worldwide, this finding can help policymakers better consider these factors for PM2.5 prediction. Moreover, human interference with ecosystems will increase again after COVID-19, leading to a rise in PM2.5 concentration. The strong explanatory power of comprehensive ecological indicators for the distribution of PM2.5 makes them worthy of consideration by decision-making departments.
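The abstract does not say which statistic underlies its Hot Spot Analysis. Assuming it is the commonly used Getis-Ord Gi* statistic, a minimal sketch on a one-dimensional row of cells looks like the following. The binary neighbourhood (each cell, itself included, plus its immediate neighbours) and the toy concentrations are assumptions for illustration only.

```python
from math import sqrt

def getis_ord_gi_star(values, radius=1):
    """Gi* z-score for each cell in a 1-D row, using binary weights:
    w_ij = 1 when |i - j| <= radius (the cell itself included), else 0."""
    n = len(values)
    x_bar = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - x_bar ** 2)
    scores = []
    for i in range(n):
        neighbours = [j for j in range(n) if abs(i - j) <= radius]
        sum_w = len(neighbours)   # sum of binary weights
        sum_w2 = len(neighbours)  # sum of squared binary weights
        sum_wx = sum(values[j] for j in neighbours)
        numerator = sum_wx - x_bar * sum_w
        denominator = s * sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical PM2.5 concentrations along a transect: low in the first
# four cells, high in the last three.
pm25 = [1.0, 1.0, 1.0, 1.0, 10.0, 10.0, 10.0]
g = getis_ord_gi_star(pm25)

for cell, score in enumerate(g):
    print(cell, round(score, 2))
```

Large positive scores mark cells surrounded by high values (hot spots) and large negative scores mark cold spots. A real analysis would use two-dimensional spatial weights and assess significance against the normal distribution.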