2022
DOI: 10.3390/math10081283
|View full text |Cite
|
Sign up to set email alerts
|

Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review

Abstract: Technologies have driven big data collection across many fields, such as genomics and business intelligence. This results in a significant increase in variables and data points (observations) collected and stored. Although this presents opportunities to better model the relationship between predictors and the response variables, this also causes serious problems during data analysis, one of which is the multicollinearity problem. The two main approaches used to mitigate multicollinearity are variable selection… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1
1

Citation Types

1
82
0
1

Year Published

2022
2022
2024
2024

Publication Types

Select...
8
1

Relationship

1
8

Authors

Journals

citations
Cited by 206 publications
(84 citation statements)
references
References 64 publications
1
82
0
1
Order By: Relevance
“…The presence of multicollinearity can be assessed for example by the variance inflation factor 104 or the weighted variance inflation factor, 105 and the problem can be minimized by combining correlated variables to a single variable or by employing regression algorithms that can deal with multicollinearity (e.g., Ridge regression and Lasso regression). 106 In addition, it is necessary to carefully design the GLM regression matrix given the SPA-fNIRS data to avoid underfitting and overfitting. Cross-validated Bayesian model selection is a solution for this issue.…”
Section: Systemic Physiology Augmented Functional Near-infrared Spect...mentioning
confidence: 99%
“…The presence of multicollinearity can be assessed for example by the variance inflation factor 104 or the weighted variance inflation factor, 105 and the problem can be minimized by combining correlated variables to a single variable or by employing regression algorithms that can deal with multicollinearity (e.g., Ridge regression and Lasso regression). 106 In addition, it is necessary to carefully design the GLM regression matrix given the SPA-fNIRS data to avoid underfitting and overfitting. Cross-validated Bayesian model selection is a solution for this issue.…”
Section: Systemic Physiology Augmented Functional Near-infrared Spect...mentioning
confidence: 99%
“…Although this creates opportunities to improve model performance, it also creates serious problems during the data analysis process. One of these problems is known as the multicollinearity problem [69]. Multicollinearity is a problem that arises when the input features of a dataset have a strong correlation with more than one of the other features in the dataset.…”
Section: Dimensional Reduction Using Lr-pcamentioning
confidence: 99%
“…Financial time series usually come with a high degree of variability and extensive noise to disrupt the forecasting performance [11,12]. The study [13] claimed that the financial market might have a complex correlation across different markets, companies, or countries. Based on this, later studies attempted to address the forecasting problem by modelling the correlation between the variables of different market [14].…”
Section: Introductionmentioning
confidence: 99%