2009
DOI: 10.1016/j.eswa.2008.06.054

Feature selection for text classification with Naïve Bayes

Cited by 540 publications (225 citation statements)
References 7 publications
“…One well-known problem transformation method in this approach is Binary Relevance (BR), which treats each label independently [34]. After separating the label set, a score function that measures the association between features and labels, such as the Pearson correlation coefficient (BR + CC) [37] or the odds ratio (BR + OR) [38], can be employed. Because the final feature score is obtained by aggregating the importance values of all (feature, label) pairs, it incurs a prohibitive computational cost when a large label set is involved.…”
Section: A Brief Review of Multi-label Feature Selection
confidence: 99%
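The BR-based scoring described in the statement above (separate each label, score every (feature, label) pair, then aggregate) can be sketched as follows. This is an illustrative NumPy sketch of the BR + CC idea, not code from the cited papers; `br_feature_scores` and the aggregation choice (sum of absolute correlations) are our own assumptions:

```python
import numpy as np

def br_feature_scores(X, Y):
    """Score each feature by aggregating its Pearson correlation with
    every binary label (illustrative sketch of BR + CC).

    X: (n_samples, n_features) feature matrix
    Y: (n_samples, n_labels) binary label matrix
    Returns: (n_features,) aggregated importance scores
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Correlation of every (feature, label) pair: O(n_features * n_labels)
    # work, which is the prohibitive cost noted for large label sets.
    num = Xc.T @ Yc                                    # (n_features, n_labels)
    denom = np.outer(np.linalg.norm(Xc, axis=0),
                     np.linalg.norm(Yc, axis=0))
    corr = num / np.where(denom == 0, 1, denom)        # guard constant columns
    return np.abs(corr).sum(axis=1)                    # aggregate over labels

# Toy example: 6 samples, 4 features, 3 labels
rng = np.random.default_rng(0)
X = rng.random((6, 4))
Y = (rng.random((6, 3)) > 0.5).astype(float)
scores = br_feature_scores(X, Y)
top2 = np.argsort(scores)[::-1][:2]   # indices of the two best features
```

The sum over labels makes the cost grow linearly with the label set size, which is exactly why the statement calls this approach expensive for large label sets.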
“…Thus, a smaller value for the label density indicates a higher sparsity for the given label set. To test the performance of the proposed method from the viewpoint of computational efficiency, we choose five multi-label feature selection methods: BR + CC [37], BR + OR [38], ELA + CHI [25], FIMF [17] and MFS [28]. Two of these, BR + CC and BR + OR, perform the feature selection process based on the binary relevance-based problem transformation strategy.…”
Section: Datasets and Experimental Settings
confidence: 99%
“…SVM is a method frequently used by researchers in text mining [4]–[6]. Among the best available methods, NB is a popular method for text classification, being computationally efficient while also offering good predictive performance [7]. Classifying short stories into categories is made possible by using text mining; one such method is NB.…”
unclassified
“…This often happens with texts that have tens of thousands of features. Most of these features are irrelevant and of no use for text classification, and can even reduce accuracy [7]. In general, the attribute set in text classification is very large; using all attributes degrades classifier performance, so to obtain better accuracy the attributes must be selected with an appropriate algorithm [9], [10].…”
unclassified
“…These fall into two main categories: the filter approach and the wrapper approach. The filter approach selects features based on general characteristics of the data [7]. It does not use a learning algorithm to evaluate feature importance.…”
Section: Introductionmentioning
confidence: 99%
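A filter criterion of the kind the statement above describes can be computed from data counts alone, with no learning algorithm involved. The sketch below uses the classic chi-square statistic over a binary term-occurrence matrix; `chi2_filter` and the toy data are our own illustrative choices, not an implementation from the cited work:

```python
import numpy as np

def chi2_filter(X, y, k):
    """Rank binary features for a binary class by a chi-square statistic,
    a filter-style criterion computed purely from co-occurrence counts.

    X: (n_samples, n_features) binary term-occurrence matrix
    y: (n_samples,) binary class labels
    Returns indices of the k highest-scoring features.
    """
    n = len(y)
    A = X[y == 1].sum(axis=0)          # term present, class positive
    B = X[y == 0].sum(axis=0)          # term present, class negative
    C = (y == 1).sum() - A             # term absent, class positive
    D = (y == 0).sum() - B             # term absent, class negative
    num = n * (A * D - B * C) ** 2
    den = (A + B) * (C + D) * (A + C) * (B + D)
    chi2 = num / np.where(den == 0, 1, den)   # guard degenerate terms
    return np.argsort(chi2)[::-1][:k]

# Toy corpus: 4 documents, 3 terms; terms 0 and 1 perfectly separate
# the classes, term 2 carries no class information.
X = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]])
y = np.array([1, 1, 0, 0])
selected = chi2_filter(X, y, k=2)   # picks the two discriminative terms
```

Because the score depends only on the contingency counts, the ranking is computed once up front, which is what makes filter methods cheaper than wrapper methods that retrain a classifier per candidate subset.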