Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017
DOI: 10.18653/v1/s17-2105

HLP@UPenn at SemEval-2017 Task 4A: A simple, self-optimizing text classification system combining dense and sparse vectors

Abstract: We present a simple supervised text classification system that combines sparse and dense vector representations of words, and the generalized representations of words via clusters. The sparse vectors are generated from word n-gram sequences (1-3). The dense vector representations of words (embeddings) are learned by training a neural network to predict neighboring words in a large unlabeled dataset. To classify a text segment, the different vector representations of it are concatenated, and the classification …
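The feature construction the abstract describes (sparse word n-gram vectors concatenated with dense embedding vectors) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy embedding table, and the choice to average word embeddings into one text-level vector are all assumptions made for the example, and the cluster features and the classifier itself are left out.

```python
from collections import Counter


def word_ngrams(tokens, n_min=1, n_max=3):
    """Return all word n-grams of length n_min..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_vocab(texts):
    """Map every n-gram seen in the given texts to a column index (sorted order)."""
    grams = sorted({g for t in texts for g in word_ngrams(t.split())})
    return {g: i for i, g in enumerate(grams)}


def sparse_vector(text, vocab):
    """Count vector over the n-gram vocabulary; most entries are zero."""
    counts = Counter(word_ngrams(text.split()))
    return [float(counts.get(g, 0)) for g in vocab]


def dense_vector(text, embeddings, dim):
    """Average the embeddings of known words (an assumption of this sketch)."""
    vecs = [embeddings[w] for w in text.split() if w in embeddings]
    if not vecs:
        return [0.0] * dim  # no known words: fall back to a zero vector
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def combined_vector(text, vocab, embeddings, dim):
    """Concatenate the sparse and dense representations of one text segment."""
    return sparse_vector(text, vocab) + dense_vector(text, embeddings, dim)


# Hypothetical toy data; real embeddings would come from a large unlabeled corpus.
train_texts = ["good movie", "bad movie"]
toy_embeddings = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "movie": [0.0, 1.0]}

vocab = build_vocab(train_texts)
features = combined_vector("good movie", vocab, toy_embeddings, dim=2)
```

Here `features` holds five n-gram counts followed by two embedding dimensions; a vector of this kind would then be passed to whichever supervised classifier is used.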

Cited by 11 publications (7 citation statements)
References 14 publications
“…For all three tasks, extracted concepts could be matched exactly to the forum posts, thus negating the potential benefit of normalization. The exact matching can perhaps be explained by the fact that data collection and extraction from noisy text sources such as social media typically rely on keyword-based searching [54].…”
Section: Discussion (mentioning)
confidence: 99%
“…unigrams, bigrams or trigram, respectively. Sarker et al [15] and Pal and Gosh [11] used n-gram features for developing sentiment analysis methods and evaluated their methods against the same datasets that we use in this work. Here, we explore the fol-Table 2.…”
Section: Methods (mentioning)
confidence: 99%
“…For all three tasks, extracted concepts could be matched exactly to the forum posts, thus negating the potential benefit of normalization. The exact matching can perhaps be explained by the fact that data collection and extraction from noisy text sources such as social media typically rely on keyword-based searching (Sarker and Gonzalez, 2017b).…”
Section: F1 (mentioning)
confidence: 99%