Proceedings of the 2010 ACM Symposium on Applied Computing
DOI: 10.1145/1774088.1774461

Feature selection for ordinal regression

Abstract: Ordinal regression (also known as ordinal classification) is a supervised learning task that consists of automatically determining the implied rating of a data item on a fixed, discrete rating scale. This problem is receiving increasing attention from the sentiment analysis and opinion mining community, due to the importance of automatically rating increasing amounts of product review data in digital form. As in other supervised learning tasks such as (binary or multiclass) classification, feature selection is…


citations
Cited by 108 publications
(144 citation statements)
references
References 9 publications
0
140
1
3
Order By: Relevance
“…For instance, in the training set of TripAdvisor-15763, the smaller of the two datasets discussed in Section 4, there are 38,447 unique words and 171,894 unique LM-features; using them all would degrade accuracy (due to overfitting) and efficiency (at both training time and classification time). For Phase 2, StarTrack relies on RR(NC*IDF), a feature selection technique for ordinal regression that we have proposed in 5), and that in previous experimentation has given consistently good results.*11 RR(NC*IDF) attributes a score to each feature, after which only the highest-scoring features are retained.…”
Section: The Internals of StarTrack: Learning and Feature Selection (mentioning)
confidence: 99%
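The excerpt above describes RR(NC*IDF) only at the level of its interface: every candidate feature receives a score and only the highest-scoring features are kept. Purely as an illustration of that score-and-keep-top-k step, here is a minimal generic sketch in Python; the `score` callable is a placeholder, not the paper's actual NC*IDF score (the round-robin policy the paper uses is sketched after the next excerpt).

```python
# Generic filter-style selection: score every feature, keep the k best.
# `score` is a placeholder scoring function, not RR(NC*IDF) itself.
def keep_top_k(features, score, k):
    """Return the k features with the highest scores."""
    return sorted(features, key=score, reverse=True)[:k]

# Hypothetical usage (names are illustrative only):
# reduced = keep_top_k(lm_features, score=ncidf_score, k=5000)
```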
“…For this purpose, we have defined a module (based on part-of-speech tagging and a simple grammar of phrases; see 4) for details) that (a) extracts complex phrases, such as hotel(NN) was(Be) very(RB) nice(JJ) […]
*11 The name "RR(NC*IDF)" stands for "round robin on negative correlation times inverse document frequency", and refers to the fact that the technique consists in computing, for each feature, a score resulting from its inverse document frequency and its negative correlation with a given rating, and then choosing the features according to a policy that "round-robins" across the ratings. The interested reader can check 5) for details.…”
Section: The Internals of StarTrack: Sentiment-based Feature Extraction (mentioning)
confidence: 99%
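The footnote spells out the two ingredients of RR(NC*IDF): a per-rating score (the feature's negative correlation with that rating times its inverse document frequency) and a round-robin policy that draws features alternately from each rating's ranking. The sketch below reconstructs only the round-robin policy from that description and is not the authors' implementation; it assumes the per-rating NC*IDF scores have already been computed and are passed in as plain dictionaries.

```python
from collections import defaultdict

def round_robin_select(scores_by_rating, k):
    """Round-robin feature selection over per-rating score tables.

    `scores_by_rating` maps each rating to a dict {feature: score}; the score
    is assumed to be the NC*IDF value described above.  Features are drawn
    alternately from each rating's ranking, best first, skipping features
    already chosen, until k distinct features have been collected.
    """
    rankings = {r: sorted(table, key=table.get, reverse=True)
                for r, table in scores_by_rating.items()}
    selected, seen = [], set()
    cursors = defaultdict(int)          # next position to try in each ranking
    while len(selected) < k:
        progressed = False
        for r, ranking in rankings.items():
            i = cursors[r]
            while i < len(ranking) and ranking[i] in seen:
                i += 1
            if i < len(ranking):
                selected.append(ranking[i])
                seen.add(ranking[i])
                cursors[r] = i + 1
                progressed = True
                if len(selected) == k:
                    return selected
            else:
                cursors[r] = i
        if not progressed:              # all rankings exhausted before k
            break
    return selected

# Hypothetical usage with a 1-5 star scale (score tables omitted):
# selected = round_robin_select(ncidf_scores_by_star, k=5000)
```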
“…In the first one, we use SentiWordNet 3.0 (Baccianella et al., 2010) to obtain the sentiment scores of each word. We use the word together with its POS tag to look up the word's sentiment score.…”
Section: Features Based on Sentiment Scores (mentioning)
confidence: 99%
“…So far, some feature evaluation algorithms have been developed for monotonic classification 31,32,33,34. The dominance-based rough set approach (DRSA) was first introduced by Greco, Matarazzo and Slowinski, where classical indiscernibility relations are replaced with dominance relations 1,35.…”
Section: Introduction (mentioning)
confidence: 99%
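As a quick illustration (not taken from the cited works) of what replacing indiscernibility with dominance means, the sketch below contrasts the two relations on toy feature vectors: dominance only requires one object to be at least as good as another on every criterion, whereas indiscernibility requires equality on every attribute.

```python
# Toy contrast between the two relations mentioned in the excerpt, assuming
# numeric, gain-type criteria (higher is better).
def dominates(x, y):
    """True if x is at least as good as y on every criterion."""
    return all(xi >= yi for xi, yi in zip(x, y))

def indiscernible(x, y):
    """Classical rough-set indiscernibility: equal on every attribute."""
    return all(xi == yi for xi, yi in zip(x, y))

# Two reviews described by (cleanliness, service, location) scores:
a, b = (4, 5, 3), (3, 5, 3)
print(dominates(a, b))      # True: a is at least as good on every criterion
print(indiscernible(a, b))  # False: they differ on the first attribute
```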