2009 Fifth International Conference on Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control (ICSCCW 2009)
DOI: 10.1109/icsccw.2009.5379438

A symmetric term weighting scheme for text categorization based on term occurrence probabilities

Cited by 5 publications (6 citation statements) | References 8 publications

“…From this perspective, the gain obtained by using a function of tf together with RF can be considered small. However, it should be noted that when the term frequency (tf) is used alone, without any other weight, the Macro-F1 score obtained on the Reuters-21578 dataset is 88.39 [23]. In other words, the gain provided by the state-of-the-art scheme tf × RF is 1.07.…”
Section: Discussion
confidence: 98%
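For readers unfamiliar with the scheme discussed above, a minimal sketch of a tf × RF weight follows, assuming the common definition of relevance frequency, rf = log2(2 + a / max(1, c)), with a and c the counts of positive and negative training documents containing the term; the function names and example counts are illustrative only.

```python
import math

def relevance_frequency(a: int, c: int) -> float:
    """RF weight for a term: a = positive docs containing the term,
    c = negative docs containing the term (assumed definition)."""
    return math.log2(2 + a / max(1, c))

def tf_rf(tf: int, a: int, c: int) -> float:
    """tf x RF weight for a term in one document."""
    return tf * relevance_frequency(a, c)

# Example: a term occurring 3 times in a document, seen in 40
# positive and 5 negative training documents.
print(tf_rf(3, 40, 5))  # 3 * log2(2 + 8) = 3 * log2(10), about 9.97
```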
“…In the case of RF and χ², a better F1 score is achieved by RF only on WebKB, whereas its score on Reuters-21578 is inferior to that of χ². It should be remembered that RF is an asymmetric scheme where, for a given value, the weights decrease as β increases [23]. On the other hand, χ² is a symmetric scheme where the weights decrease as β increases from zero to one and increase for increasing values of β when β > 1.…”
Section: Experimental Work
confidence: 98%
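The symmetric/asymmetric contrast in the quote can be checked numerically. The small sketch below assumes β is a ratio of the term's occurrence probabilities in the two classes, and uses log2(2 + 1/β) as a stand-in for an RF-like asymmetric weight and (ln β)² as a stand-in for a χ²-like symmetric weight; neither functional form is taken from the cited papers.

```python
import math

def rf_like(beta: float) -> float:
    """Asymmetric stand-in: monotonically decreasing in beta."""
    return math.log2(2 + 1 / beta)

def symmetric_like(beta: float) -> float:
    """Symmetric stand-in: same weight at beta and 1/beta,
    with its minimum at beta = 1."""
    return math.log(beta) ** 2

for beta in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(f"beta={beta:4.2f}  rf_like={rf_like(beta):5.2f}  "
          f"symmetric_like={symmetric_like(beta):5.2f}")
# rf_like keeps decreasing as beta grows; symmetric_like falls to 0
# at beta = 1 and rises again for beta > 1, matching the quoted behaviour.
```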
“…The authors proposed three new term weighting schemes (TWS) for QC and analysed their behaviour for TC. In [57], the authors proposed that term weights can also be interpreted as term occurrence probabilities in the positive (+ve) and negative (-ve) categories.…”
Section: Related Work
confidence: 99%
“…Ignoring the prior class-probability ratio, the classification task can be achieved by the matching score function [19]

$$\mathrm{score}(d) = \sum_{u} N_u \log \frac{\theta_{1u}}{\theta_{0u}} \qquad (5)$$

where $N_u$ is the term frequency of $w_u$ in the document, $\theta_{1u}$ is the multinomial parameter of the positive class, and $\theta_{0u}$ is the multinomial parameter of the negative class. After replacing the parameters by their Bayesian estimators, we obtain a new scheme as…”
Section: Relevance Term Frequency Weighting Scheme
confidence: 99%
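A minimal sketch of the matching score in equation (5), using add-one (Laplace) smoothed multinomial estimates as a stand-in for the Bayesian estimators the quote mentions; all document data below are toy values.

```python
import math
from collections import Counter

def multinomial_params(docs, vocab):
    """Add-one smoothed multinomial estimates theta_u for one class;
    docs is a list of token lists. (Stand-in for the Bayesian
    estimators mentioned in the quote.)"""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def matching_score(doc, theta_pos, theta_neg):
    """Equation (5): sum_u N_u * log(theta_1u / theta_0u),
    ignoring the prior class-probability ratio."""
    freqs = Counter(doc)
    return sum(n * math.log(theta_pos[w] / theta_neg[w])
               for w, n in freqs.items() if w in theta_pos)

# Toy example with hypothetical documents.
pos_docs = [["good", "great", "fun"], ["good", "fun"]]
neg_docs = [["bad", "boring"], ["bad", "dull", "boring"]]
vocab = {"good", "great", "fun", "bad", "boring", "dull"}
theta_pos = multinomial_params(pos_docs, vocab)
theta_neg = multinomial_params(neg_docs, vocab)
print(matching_score(["good", "fun", "fun"], theta_pos, theta_neg))  # > 0
```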
“…It helps to adjust for the fact that some words appear more frequently in general, and is known as the global weight (or the collection frequency factor) [4-6]. However, tf-idf is not the best choice for a supervised learning task, in which the labels of the training documents carry important information [4]. Using this information, supervised term weighting methods can assign larger weights to the terms that discriminate among the different classes, and such methods have gained increasing attention since the beginning of the new century.…”
Section: Introduction
confidence: 99%
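To make the contrast concrete, a brief sketch comparing the unsupervised idf global weight with a generic label-aware weight (a smoothed log odds ratio, used here purely as an illustration, not as the specific supervised schemes of the cited papers):

```python
import math

def idf(df: int, n_docs: int) -> float:
    """Unsupervised global weight: rarer terms get larger weights,
    regardless of class labels."""
    return math.log(n_docs / df)

def supervised_weight(a: int, b: int, c: int, d: int) -> float:
    """Generic supervised weight (log odds ratio, illustrative only):
    a/c = positive/negative docs containing the term,
    b/d = positive/negative docs lacking it. Add-one smoothing."""
    return math.log(((a + 1) * (d + 1)) / ((b + 1) * (c + 1)))

# A term in 50 of 100 docs gets a low idf, yet it is highly
# discriminative if 45 of those 50 occurrences are in the
# 50 positive documents.
print(idf(50, 100))                     # about 0.69
print(supervised_weight(45, 5, 5, 45))  # about 4.07 (large, label-aware)
```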