K-means is one of the most widely used algorithms in data clustering. A number of metrics are coupled with k-means to cluster data, with the aim of improving both the local compactness of clusters and their global separation. Before data are finally assigned to their clusters, selecting the optimal number of clusters is a crucial step in the clustering process. The present work aims to build a new clustering metric/heuristic that takes into account both the spatial dispersion and the inferential characteristics of the data to be clustered. To this end, a Geometry-Inference based Clustering (GIC) heuristic is proposed for selecting the optimal number of clusters. The approach uses the “Initial speed rate” as the main geometric parameter to be studied inferentially. The corresponding histograms are then fitted with classical distributions. A clear linear behaviour of the distributions’ parameters with respect to the number of clusters was detected for each of the 14 datasets adopted in this work. For each dataset, the optimal k* is observed to match the change-point defined as the intersection of two clearly salient lines. All fits are assessed with chi-squared (χ²) tests, which show excellent agreement in terms of p-values; the linear fits are additionally assessed with R². A change-point algorithm is then run to select k*. In summary, the GIC heuristic is fully quantitative and fully automated; no qualitative index or graphical technique is used.
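The change-point step described above (k* taken at the intersection of two salient lines) can be illustrated with a minimal sketch. This is not the GIC implementation: the function name `two_line_change_point`, the `min_points` parameter, and the synthetic values are assumptions made for illustration, and the two lines are found here by a plain exhaustive two-segment least-squares fit.

```python
import numpy as np

def two_line_change_point(k_values, y_values, min_points=2):
    """Fit two straight lines to (k, y) and return the k at which they cross.

    Hypothetical helper: every admissible split is tried, each side is fitted
    with an ordinary least-squares line, and the split with the smallest total
    squared error is kept.
    """
    k = np.asarray(k_values, dtype=float)
    y = np.asarray(y_values, dtype=float)
    best_sse, best_lines = np.inf, None
    for split in range(min_points, len(k) - min_points + 1):
        lines, sse = [], 0.0
        for ks, ys in ((k[:split], y[:split]), (k[split:], y[split:])):
            slope, intercept = np.polyfit(ks, ys, 1)
            sse += float(np.sum((ys - (slope * ks + intercept)) ** 2))
            lines.append((slope, intercept))
        if sse < best_sse:
            best_sse, best_lines = sse, lines
    (a1, b1), (a2, b2) = best_lines
    if np.isclose(a1, a2):
        raise ValueError("the two fitted lines are parallel; no intersection")
    # Intersection of y = a1*k + b1 and y = a2*k + b2.
    return (b2 - b1) / (a1 - a2)

# Synthetic, made-up parameter values: steep decrease up to k = 4, then nearly flat.
k_grid = np.arange(1, 9)
param = np.where(k_grid <= 4, 10.0 - 2.0 * k_grid, 2.4 - 0.1 * k_grid)
crossing = two_line_change_point(k_grid, param)
k_star = int(round(crossing))
```

On this synthetic curve the two fitted lines cross at approximately k = 4, so `k_star` is 4. With real data, the y-values would be the fitted distribution parameter computed for each candidate number of clusters.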
Background: COVID-19 caused a worldwide outbreak that brought the majority of human activities to an abrupt halt. Many stakeholders proposed interventions to slow down the disease, and a number of papers were devoted to understanding the pandemic, but to a lesser extent to socio-economic analysis. In this paper, a socio-economic analysis is proposed to investigate the early-stage effect of socio-economic factors on the spread of COVID-19.
Methods: Fifty-two countries were selected for this study. A cascade algorithm was developed to extract the reproduction number R0 and the day J*; both should decrease as the pandemic flattens. Subsequently, R0 and J* were modeled as functions of socio-economic factors using multilinear stepwise regression.
Results: The findings indicate that a low number of days before lockdown (DBLD) should flatten the pandemic by reducing J*. Fortunately, DBLD is the only parameter that can be tuned in the short term; the other socio-economic parameters cannot easily be adjusted, as they are updated annually. Furthermore, the Elderly factor was found to be a major influence, especially because it is involved in the interaction terms of the R0 model. Simulations showed that the health care system could improve the damping of the pandemic for low Elderly values. In contrast, above a given Elderly value, the reproduction number R0 cannot be reduced even for developed countries (those with high HCI values), meaning that the disease’s severity cannot be attenuated regardless of the performance of the corresponding health care system; non-pharmaceutical interventions are then expected to be more efficient than corrective measures.
Discussion: The relationship between the socio-economic factors and the pandemic parameters R0 and J* is more complex than in the models proposed in the literature. The quadratic regression model proposed here ranks the most influential parameters in the following approximate order: DBLD, HCI, Elderly, Tav, CO2, and WC, appearing as first-order, interaction, and second-order terms.
Conclusions: This modeling allowed interaction terms to emerge that do not appear in similar studies, which highlights a more complex relationship between the spread of infection and the socio-economic factors. Future work will focus on enriching the datasets and on optimizing the controllable parameters to slow down similar pandemics in the short term.
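The structure of a quadratic regression with first-order, interaction, and second-order terms, as described above, can be sketched as follows. This is only an illustration under stated assumptions: it fits all terms at once by ordinary least squares rather than by the stepwise selection used in the study, the helper names `quadratic_design_matrix` and `fit_quadratic` are hypothetical, and the data are synthetic (randomly generated, not the 52-country dataset).

```python
import itertools
import numpy as np

def quadratic_design_matrix(X, names):
    """Build columns for intercept, first-order, interaction and squared terms."""
    X = np.asarray(X, dtype=float)
    columns, labels = [np.ones(X.shape[0])], ["1"]
    for j, name in enumerate(names):                       # first-order terms
        columns.append(X[:, j])
        labels.append(name)
    for i, j in itertools.combinations(range(X.shape[1]), 2):  # interactions
        columns.append(X[:, i] * X[:, j])
        labels.append(f"{names[i]}*{names[j]}")
    for j, name in enumerate(names):                       # second-order terms
        columns.append(X[:, j] ** 2)
        labels.append(f"{name}^2")
    return np.column_stack(columns), labels

def fit_quadratic(X, y, names):
    """Ordinary least-squares fit of the full quadratic model (no stepwise selection)."""
    design, labels = quadratic_design_matrix(X, names)
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return dict(zip(labels, coef))

# Synthetic example: two made-up predictors and a response that contains
# a first-order term and an interaction term, with no noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(52, 2))
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 0] * X[:, 1]
model = fit_quadratic(X, y, ["DBLD", "Elderly"])
```

In this noise-free example the fit recovers the interaction coefficient `model["DBLD*Elderly"]` and leaves the squared terms near zero. A stepwise procedure would additionally add or drop terms according to a significance criterion, which is not shown here.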