2020
DOI: 10.1109/access.2020.3042119

CBRG: A Novel Algorithm for Handling Missing Data Using Bayesian Ridge Regression and Feature Selection Based on Gain Ratio

Abstract: Existing imputation methods may lead to biased predictions and may decrease or increase the statistical influence, which leads to improper estimations. The performance of several missing value imputation approaches depends on the size of the dataset and the number of missing values within it. In this work, the authors propose a novel algorithm for handling missing data and compare it against some common imputation approaches. The proposed algorithm imputes missing values in cumulative order depending on the gain ratio (GR) …

Citations: cited by 22 publications (15 citation statements)
References: 37 publications
“…But such a solution may result in poor analysis, a loss of the ability to recognize statistically significant variations, and bias. Missingness mechanisms have a large effect on FS, which is why they need to be taken into consideration before applying any FS technique [3].…”
Section: Feature Selection (mentioning)
confidence: 99%
“…In single imputation, each MV is imputed once with a single value. Although single imputation requires few computational resources, it can produce biased results [3]. In multiple imputation, m copies of the original dataset are generated.…”
Section: Handling Missing Data (mentioning)
confidence: 99%
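
The contrast between single and multiple imputation in the statement above can be illustrated with a minimal sketch. It is not the method of the cited paper: the function names, the toy data, and the use of random draws from the observed values as the imputation model are all assumptions chosen for illustration, and pooling is reduced to averaging the per-copy means.

```python
import random
import statistics


def single_mean_imputation(values):
    """Fill each missing entry (None) once with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


def multiple_imputation(values, m=5, seed=0):
    """Build m completed copies of the data.

    Assumption: each missing entry is filled with a random draw from the
    observed values, a simple stand-in for a real imputation model.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    return [
        [rng.choice(observed) if v is None else v for v in values]
        for _ in range(m)
    ]


def pooled_mean(copies):
    """Pool the per-copy estimates (here, the mean) by averaging them."""
    return statistics.mean([statistics.mean(c) for c in copies])


# Hypothetical toy column with two missing values.
data = [2.0, None, 4.0, 6.0, None, 8.0]

single = single_mean_imputation(data)
copies = multiple_imputation(data, m=5)

print(single)
print(len(copies), pooled_mean(copies))
```

Single imputation yields one completed column and hides the uncertainty about the filled values, whereas the m copies differ from one another, so the spread of their estimates reflects that uncertainty.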
“…These widely used FS methods are classified into four categories: (1) Filter-based methods evaluate features depending on their inherent properties, such as the statistical properties of the data. The most popular filter methods are chi-square [12], the gain ratio [13], information gain [14], ReliefF [15] and minimum Redundancy Maximum Relevance (mRMR) [16]. The main advantages of filter methods are that they do not depend on classifiers and are fast and computationally straightforward.…”
Section: Introduction (mentioning)
confidence: 99%
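
As an illustration of a filter method, the gain ratio of a categorical feature can be computed as its information gain divided by its split information, following the standard C4.5 definition. The sketch below uses that definition with hypothetical toy data and function names; it does not reproduce how the paper itself applies the gain ratio.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Gain ratio = information gain / split information."""
    n = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Entropy of the labels that remains after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature's own value distribution.
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy dataset: two categorical features and a class label.
features = {
    "outlook": ["sunny", "sunny", "rain", "rain"],
    "windy": ["yes", "no", "yes", "no"],
}
play = ["no", "no", "yes", "yes"]

# Rank features from most to least informative; no classifier is involved.
ranking = sorted(
    features, key=lambda name: gain_ratio(features[name], play), reverse=True
)
print(ranking)
```

Because the score depends only on the feature and label distributions, the ranking is classifier-independent, which is the property the statement above attributes to filter methods.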
“…For decades, academics have relied on a variety of ad hoc methods to "repair" data, such as eliminating incomplete instances or replacing missing values. The performance of several missing value imputation approaches depends on the size of the dataset and the number of missing values within the dataset [7]. Unfortunately, the bulk of these solutions are prone to severe bias since they rely on a relatively rigid assumption about the cause of missing data.…”
Section: Introduction (mentioning)
confidence: 99%