Proceedings of the Fourth Workshop on Analytics for Noisy Unstructured Text Data 2010
DOI: 10.1145/1871840.1871853

Tokenizing micro-blogging messages using a text classification approach

Cited by 27 publications
(7 citation statements)
References 10 publications
“…Bermingham and Smeaton [2] suggest that it is easier to infer the sentiment polarity of micro-blogging posts compared with blogs, which have richer textual content (they also note the similarity between micro-blogging posts and SMS messages). However, the automatic processing of micro-blogging posts can be problematic because of the use of non-standard words and unusual punctuation [9]. Leong, et al [10] use sentiment mining to analyze SMS messages in teaching evaluations.…”
Section: Introduction (mentioning)
confidence: 99%
“…Tokenization is important step in natural language processing and potentially affects the sentiment analysis of the texts used in the social media (Laboreiro et al, 2010; Bird et al, 2009). Morphologically, Arabic language is very rich, and hence the Arabic tweets text were cleaned by deleting the non-Arabic words characters and special Twitter characters from the sentence.…”
Section: Tokenization Cleaning and Normalization (mentioning)
confidence: 99%
“…As pointed out by Laboreiro, Sarmento, Teixeira, and Oliveira (2010), tokenization significantly affects sentiment analysis, especially in the case of social media. Although Ark-tweet-nlp tool (Gimpel et al, 2011) was developed and tested in English, it yields satisfactory results in Czech as well, according to our initial experiments on the Facebook corpus.…”
Section: Preprocessing (mentioning)
confidence: 99%