Proceedings of the First International Workshop on Social Media Retrieval and Analysis 2014
DOI: 10.1145/2632188.2632205

Short text categorization exploiting contextual enrichment and external knowledge

Abstract: We address the problem of the categorization of short texts, like those posted by users on social networks and microblogging platforms. We specifically focus on Twitter. Since short texts do not provide sufficient word occurrences, and they often contain abbreviations and acronyms, traditional classification methods such as "Bag-of-Words" have limitations. Our proposed method enriches the original text with a new set of words, to add more semantic value by using information extracted from webpages of the same …
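
The enrichment idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract is truncated, so how context pages are retrieved and how words are selected is unknown. The sketch assumes the context documents are already available as plain strings, and the names `tokenize`, `enrich`, and `top_k` are hypothetical. It simply adds the most frequent context words that the short text lacks to its bag-of-words.

```python
import re
from collections import Counter

# Simple token pattern (an assumption): letters, digits, '#', '@', apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9#@']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


def enrich(short_text, context_docs, top_k=5):
    """Return a bag-of-words for short_text, extended with the top_k most
    frequent words from context_docs that the short text does not contain."""
    bow = Counter(tokenize(short_text))

    # Count word occurrences across all context documents.
    context_counts = Counter()
    for doc in context_docs:
        context_counts.update(tokenize(doc))

    # Candidates are context words absent from the original text,
    # sorted by descending frequency, then alphabetically for determinism.
    candidates = [(w, c) for w, c in context_counts.items() if w not in bow]
    candidates.sort(key=lambda wc: (-wc[1], wc[0]))

    # Each selected word is added once to the enriched bag-of-words.
    for word, _ in candidates[:top_k]:
        bow[word] += 1
    return bow


tweet = "gg wp lol"
context = ["league of legends match", "legends match final", "match"]
enriched = enrich(tweet, context, top_k=2)
```

A realistic pipeline would likely also filter stop words and weight the added terms; both are left out here to keep the sketch short.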

Cited by 12 publications
(4 citation statements)
References 9 publications
“…It also incorporates central sentences from articles found in Wikipedia that are linked with tweet, and lastly, to improve the performance of the fusion, it uses the resultant clusters retrieved from the expanded micro-blog based on the cluster. In Mizzaro et al [21] a method that uses information extracted from the web derived from the same temporal context was proposed. In this method, Wikipedia which acts as an external resource is used to query the words.…”
Section: Related Work (mentioning)
confidence: 99%
“…The problem of data sparsity in short-text analysis is often handled by contextual enrichment methods. Such methods exploit external sources of semantic knowledge to extend the sparse features of short-text with additional information to make it appear like a long text or a heterogeneous document [37], [38], [39]. Based on this analogy, we consider a set of contextual enrichment methods, that are typically used in short-text analysis, to contextually enrich software requirements with domain-specific data derived from Wikipedia.…”
Section: B Requirements Text (mentioning)
confidence: 99%
“…In this paper the two primary sources for constructing enriched BoW have been identified as Legal statute pertaining to dowry acts [304B, 498, 256] and the vast knowledge accrued by the legal experts over a period of time. The reason for choosing Legal Statute as the one of the external knowledge source [4,5] for constructing the enriched BoW is that statute happens to be the basis for the different sections of IPC (Indian Penal Code). The enriched BoW thus created is a semantic BoW which can be used as a major source of metadata for the researchers whose research area happens to be dowry cases.…”
Section: Introduction (mentioning)
confidence: 99%