Proceedings of the 2018 International Conference on Machine Learning and Machine Intelligence
DOI: 10.1145/3278312.3278316
Mutual Information-based Feature Selection Approach to Reduce High Dimension of Big Data

Cited by 10 publications (12 citation statements)
References 11 publications
“…As a rule of thumb, more features on hand provide more information and hence should provide better classification accuracy [ 10 ]. However, in many instances, if we have more number of features in a dataset, and we use this data to train a classification model, the model gets confused while learning all the features on data, which results in decreasing classification accuracy instead of increasing it [ 10 ] [ 11 ]. To add further, the computations required by any classification system for high-dimensional data is a very expensive task in terms of time and memory [ 11 ].…”
Section: Introduction
confidence: 99%
“…However, in many instances, if we have more number of features in a dataset, and we use this data to train a classification model, the model gets confused while learning all the features on data, which results in decreasing classification accuracy instead of increasing it [ 10 ] [ 11 ]. To add further, the computations required by any classification system for high-dimensional data is a very expensive task in terms of time and memory [ 11 ]. Therefore, a technique called feature selection is leveraged in order to select relevant features from the available set of features in high-dimensional datasets [ 11 ] [ 12 ].…”
Section: Introduction
confidence: 99%
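The paper's own relevance criterion, mutual information, is one such filter-style score. Below is a minimal sketch (not the authors' implementation) using scikit-learn's mutual_info_classif to rank features by their estimated mutual information with the class label and keep the top k; the synthetic dataset and the choice k = 10 are illustrative assumptions.

```python
# Minimal sketch of mutual-information-based feature selection, assuming a
# labelled tabular dataset; mutual_info_classif is one common estimator of
# the MI between each feature and the class label.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Synthetic high-dimensional data: 1000 samples, 100 features, 10 informative.
X, y = make_classification(n_samples=1000, n_features=100,
                           n_informative=10, random_state=0)

# Score every feature by its estimated MI with y and keep the k highest
# scoring ones (k = 10 is an illustrative choice, not from the paper).
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)  # (1000, 100) -> (1000, 10)
```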
“…Information Gain (IG): IG is one of the FS methods used in text classification, which utilizes a global filter-based approach [37]. IG is a method for evaluating entropybased features [38] and is widely used in statistics and machine learning [21]. The higher the entropy is, the more information about the feature is obtained [37].…”
Section: Proposed Methods
confidence: 99%
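Information gain has a compact standard definition, IG(Y; X) = H(Y) − H(Y | X): the reduction in class-label entropy once the feature's value is known. The following is a self-contained sketch under that standard definition; the function names and toy data are illustrative, not drawn from the cited work.

```python
# Sketch of information gain for one discrete feature, assuming the standard
# definition IG(Y; X) = H(Y) - H(Y | X).
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) in bits of a sequence of class labels."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - sum_v P(X=v) * H(Y | X=v) for a discrete feature."""
    feature, labels = np.asarray(feature), np.asarray(labels)
    h_y_given_x = 0.0
    for v in np.unique(feature):
        mask = feature == v               # P(X = v) is mask.mean()
        h_y_given_x += mask.mean() * entropy(labels[mask])
    return entropy(labels) - h_y_given_x

# A feature that perfectly predicts the label recovers all of H(Y) = 1 bit;
# an independent feature yields no gain.
print(information_gain([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
print(information_gain([0, 1, 0, 1], [0, 0, 1, 1]))  # 0.0
```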
“…For instance, it is used in IDSs. There are three techniques for selecting features [5][6][7]: wrapper [8], filter [9] and embedded techniques [10]. Through the embedded technique, the feature selection for a given learning algorithm is integrated into the training process.…”
Section: Introduction
confidence: 99%
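To make the three families concrete, here is a hedged scikit-learn sketch (the estimators and dataset are assumptions, not drawn from the cited paper): a filter scores features independently of any model, a wrapper such as recursive feature elimination repeatedly refits a model over feature subsets, and an embedded method selects during training itself, e.g. via an L1 penalty.

```python
# Illustrative sketch of filter, wrapper, and embedded feature selection.
from sklearn.datasets import make_classification
from sklearn.feature_selection import (SelectKBest, mutual_info_classif,
                                       RFE, SelectFromModel)
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=30,
                           n_informative=5, random_state=0)

# Filter: score features independently of any classifier (here, by MI).
filt = SelectKBest(mutual_info_classif, k=5).fit(X, y)

# Wrapper: search feature subsets by repeatedly fitting a model (RFE).
wrap = RFE(LogisticRegression(max_iter=1000),
           n_features_to_select=5).fit(X, y)

# Embedded: selection happens inside training, via L1-induced sparsity.
embed = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1)).fit(X, y)

for name, sel in [("filter", filt), ("wrapper", wrap), ("embedded", embed)]:
    print(name, sel.get_support().nonzero()[0])  # indices of kept features
```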