2010 Second Cybercrime and Trustworthy Computing Workshop 2010
DOI: 10.1109/ctc.2010.17

Authorship Attribution for Twitter in 140 Characters or Less

Abstract: Authorship attribution is a growing field, moving from beginnings in linguistics to recent advances in text mining. Through this change came an increase in the capability of authorship attribution methods, both in their accuracy and in their ability to consider more difficult problems. Research into authorship attribution in the 19th century considered it difficult to determine the authorship of a document of fewer than 1000 words. By the 1990s this value had decreased to less than 500 words and in the early 21 cen…

Citations: cited by 97 publications (82 citation statements)
References: 22 publications
“…We evaluate an LSTM trained on bigrams, since the LSTM produced better results on a small validation set. CHAR: Character and word n-grams have been the core of many AA systems (Stamatatos, 2009; Schwartz et al, 2013; Layton et al, 2010). We tested various n-gram combinations on the small validation set and our final system uses character 2,3,4-grams with logistic regression.…”
Section: N-gram Convolutional Neural Network (mentioning)
confidence: 99%
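The character 2,3,4-gram features named in the statement above can be illustrated with a minimal sketch. It is not the citing authors' code: the function name `char_ngrams` and the sample text are hypothetical, and the logistic regression step is left out. It only shows what counting overlapping character n-grams of several sizes looks like.

```python
from collections import Counter


def char_ngrams(text, sizes=(2, 3, 4)):
    """Count overlapping character n-grams for each size in `sizes`.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    counts = Counter()
    for n in sizes:
        # Slide a window of width n over the text, one character at a time.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


# Usage: build a frequency profile for one short message.
profile = char_ngrams("so good, so so good")
print(profile.most_common(5))
```

A profile like this would typically be turned into a fixed-length vector before being passed to a classifier; that step is omitted here.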
“…With the advent of social media, one can even argue that building systems that work with short texts equally, if not more important than long texts like books. This need is also reflected in the increasing interest in AA of small texts such as tweets and reviews in AA research community (Qian et al, 2015; Schwartz et al, 2013; Layton et al, 2010).…”
Section: Introduction (mentioning)
confidence: 99%
“…More recently, it has been demonstrated that the authorship of twitter messages can be attributed with a certain degree of certainty [11]. Surprisingly, the authors concluded that authorship could be identified at 120 tweets per user, and that more messages would not improve accuracy significantly.…”
Section: Related Work (mentioning)
confidence: 95%
“…While the majority of research in the computational sciences on Twitter data has focused on issues such as topic detection (Cataldi et al, 2010), event detection, (Weng and Lee, 2011; Sakaki et al, 2010), sentiment analysis, (Kouloumpis et al, 2011), and other tasks based primarily on the topical and/or semantic content of tweets, there is a growing body of work which investigates more subtle forms of information represented in tweets, such as reputation and trustworthiness, (O'Donovan et al, 2012), authorship attribution (Layton et al, 2010; Bhargava et al, 2013) and Twitter spam detection, (Benevenuto et al, 2010).…”
Section: Studies On Twitter Data (mentioning)
confidence: 99%