2021
DOI: 10.3390/app11052378

Machine Learning-Based Identification of the Strongest Predictive Variables of Winning and Losing in Belgian Professional Soccer

Abstract: This study aimed to identify the strongest predictive variables of winning and losing in the highest Belgian soccer division. A predictive machine learning model based on a broad range of variables (n = 100) was constructed, using a dataset consisting of 576 games. To avoid multicollinearity and reduce dimensionality, a Variance Inflation Factor filter (threshold of 5) and BorutaShap were applied, respectively. A total of 13 variables remained and were used to predict winning or losing using Extreme Gradient Boosting. …
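The VIF step from the abstract can be sketched in plain NumPy. This is an illustrative reconstruction, not the paper's exact procedure (which feeds the surviving variables onward to BorutaShap and XGBoost): each feature is regressed on the remaining ones, VIF_j = 1 / (1 − R_j²), and the worst offender is dropped until every VIF falls below the threshold of 5. The synthetic data and function name are assumptions.

```python
import numpy as np

def vif_filter(X, threshold=5.0):
    """Iteratively drop the column with the highest Variance Inflation
    Factor until all remaining VIFs fall below the threshold.
    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on all other retained columns via least squares."""
    cols = list(range(X.shape[1]))
    while len(cols) > 1:
        vifs = []
        for j in cols:
            others = [c for c in cols if c != j]
            A = np.column_stack([X[:, others], np.ones(len(X))])
            y = X[:, j]
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
            vifs.append(1.0 / max(1.0 - r2, 1e-12))
        if max(vifs) < threshold:
            break
        cols.pop(int(np.argmax(vifs)))  # drop the most collinear column
    return cols

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
# Fourth column is nearly a copy of the first -> highly collinear pair
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=200)])
kept = vif_filter(X, threshold=5.0)
print(kept)  # one member of the near-duplicate pair is removed
```

Iterative removal (rather than dropping all high-VIF columns at once) matters because deleting one column of a collinear pair typically brings its partner's VIF back under the threshold.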


Cited by 31 publications (27 citation statements)
References 48 publications
“…To prepare the input features for analysis, categorical variables were converted into dummy variables using OneHotEncoder, a scikit-learn (version 1.0.1) preprocessing package in Python. To avoid collinearity effects between the input variables, a Variance Inflation Factor analysis was conducted (threshold = 5) [26].…”
Section: Methods
confidence: 99%
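The encoding step quoted above can be sketched as follows; the variable names are illustrative, not taken from the paper. Note that full one-hot encoding makes a category's dummy columns perfectly collinear (they sum to 1), which a subsequent VIF analysis would flag; `drop='first'` is one common remedy.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical match-context variable (not from the paper)
venue = np.array([["home"], ["away"], ["home"], ["away"]])

enc = OneHotEncoder()                      # returns a sparse matrix by default
dummies = enc.fit_transform(venue).toarray()
print(enc.categories_)                     # categories are sorted: 'away', 'home'
print(dummies)                             # one indicator column per category
```

Passing `OneHotEncoder(drop="first")` instead would emit a single `home` indicator, avoiding the built-in collinearity before the VIF filter runs.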
“…Feature selection approaches are essential components of the model design phase in achieving the optimum performance of a forecast model. The Python-based BorutaShap algorithm effectively eliminates irrelevant and largely redundant features, as shown in a study where it was employed to identify the strongest predictors of winning and losing in Belgian professional soccer [22]. Along with such selective filtering, robust data decomposition schemes such as SWT efficiently accomplish dimensionality reduction of the input variables.…”
Section: Related Work
confidence: 99%
“…It is highly compatible and supports any tree-based learner, such as RF, XGBoost, or decision tree (DT), as the base model [22], [32]. To select the most significant features, the Boruta algorithm creates shadow features (exact replicas) of each feature and shuffles the values in the shadow features to remove their correlations with the response variable [33].…”
Section: B. Wrapper-Based BorutaShap
confidence: 99%
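The shadow-feature mechanism described above can be illustrated in a single round. BorutaShap itself uses SHAP values and repeated trials; the sketch below substitutes impurity-based random-forest importances and one pass, so the threshold rule and all names are simplifying assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 400
signal = rng.normal(size=(n, 2))           # two genuinely predictive features
noise = rng.normal(size=(n, 3))            # three irrelevant features
X = np.column_stack([signal, noise])
y = (signal[:, 0] + signal[:, 1] > 0).astype(int)

# Shadow features: column-wise shuffled copies, which destroys any
# correlation with the response while preserving each marginal distribution
shadows = rng.permuted(X, axis=0)
X_aug = np.column_stack([X, shadows])

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_aug, y)
imp = model.feature_importances_
real, shadow = imp[: X.shape[1]], imp[X.shape[1]:]

# Keep only features whose importance beats the best-performing shadow
kept = [j for j in range(X.shape[1]) if real[j] > shadow.max()]
print(kept)
```

The two signal columns should clear the max-shadow bar, while the noise columns score in the same range as their shadows; the full Boruta procedure repeats this comparison over many iterations and applies a statistical test rather than a single cutoff.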
“…Such datasets, i.e., high-dimension low-sample size (HDLSS), are very common in clinical settings [36], [37] and are known to present several statistical challenges [38][39][40][41]. Machine learning (ML) techniques offer several tools to handle these challenges and have been used extensively for high-dimensionality problems [42][43][44][45][46][47]. When the sample size is small, feature selection is a crucial data preprocessing step that allows choosing the variables that contribute the most to the target effect while minimizing possible redundancies.…”
Section: Introduction
confidence: 99%