Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics - ACL '04 2004
DOI: 10.3115/1218955.1218981

Linguistic profiling for author recognition and verification

Abstract: A new technique is introduced, linguistic profiling, in which large numbers of counts of linguistic features are used as a text profile, which can then be compared to average profiles for groups of texts. The technique proves to be quite effective for authorship verification and recognition. The best parameter settings yield a False Accept Rate of 8.1% at a False Reject Rate equal to zero for the verification task on a test corpus of student essays, and a 99.4% 2-way recognition accuracy on the same corpus.
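
The idea in the abstract, comparing a text's feature counts against an average profile for a group of texts, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's method: the features are plain word forms, the counts are turned into relative frequencies, and the comparison is a mean absolute z-score. The function names `profile`, `average_profile`, and `distance` are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def profile(tokens, features):
    """Relative frequency of each chosen feature (here: a word form) in one text."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for an empty text
    return [counts[f] / total for f in features]


def average_profile(profiles):
    """Per-feature mean and population standard deviation over a group of profiles."""
    columns = list(zip(*profiles))
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def distance(text_profile, means, stdevs):
    """Mean absolute z-score against the group profile (illustrative choice only).

    Features with zero spread in the group are skipped; lower means closer.
    """
    zs = [abs(x - m) / s for x, m, s in zip(text_profile, means, stdevs) if s > 0]
    return sum(zs) / len(zs) if zs else 0.0
```

A text would then be attributed to, or accepted for, the group whose average profile it is closest to, with the acceptance threshold left as a tunable parameter.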

Cited by 86 publications (63 citation statements)
References 5 publications
“…A number of studies used the output of syntactic text chunkers and parsers to create features, and found that they could considerably improve results based on traditional word based analysis alone (Baayen et al 1996; Stamatatos et al 2000, 2001; Gamon 2004; van Halteren 2004; Chaski 2005; Uzuner and Katz 2005; Hirst & Feiguina 2007). Many studies have used the frequencies of short sequences of parts-of-speech (or combinations of parts-of-speech and other classes of words) as a simple method for approximating syntactic features for this purpose (Argamon-Engelson et al 1998; Kukushkina et al 2001; De Vel et al 2001; Koppel et al 2002; Koppel & Schler 2003; Chaski 2005; Koppel et al 2005, 2006a; van Halteren et al 2005; Zhao et al 2006; Zheng et al 2006).…”
Section: Syntax and Parts-of-speech (mentioning)
confidence: 99%
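
The "frequencies of short sequences of parts-of-speech" mentioned in this statement can be sketched as follows. This is an illustrative assumption, not the procedure of any cited study: the input is taken to be a list of tags already produced by some tagger, the tag names in the usage note are Penn Treebank style, and `pos_ngram_frequencies` is a hypothetical name.

```python
from collections import Counter


def pos_ngram_frequencies(tags, n=3):
    """Relative frequency of each length-n sequence of part-of-speech tags."""
    ngrams = [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]
    if not ngrams:
        return {}  # text shorter than n tags
    counts = Counter(ngrams)
    return {gram: count / len(ngrams) for gram, count in counts.items()}
```

For example, `pos_ngram_frequencies(["DT", "NN", "VBZ", "DT", "NN"], n=2)` yields four bigrams, two of which are `("DT", "NN")`. Such frequencies could serve as entries in a feature vector alongside word-based counts.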
“…A variety of performance measures have been used in previous work on this task including false acceptance and false rejection rates [60,17], accuracy [25,26], recall, precision, F1 [30], balanced error rate [19], recall-precision graphs [26], macro-average precision and recall [1], and ROC graphs [22]. Unfortunately, these measures are not able to explicitly estimate the ability of an approach to leave problems unanswered, a fact which is crucial in a cost-sensitive task like this.…”
Section: Related Work (mentioning)
confidence: 99%
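
Two of the measures listed in this statement, the false acceptance and false rejection rates, can be computed from verification scores as in the sketch below. The setup is assumed for illustration: higher scores mean "more likely the claimed author", a text is accepted when its score reaches the threshold, and the names `far_frr` and `far_at_zero_frr` are hypothetical. The second function mirrors the kind of figure quoted in the abstract (a False Accept Rate at a False Reject Rate of zero) without claiming to reproduce how the paper computed it.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when a text is accepted if score >= threshold."""
    false_rejects = sum(s < threshold for s in genuine_scores)
    false_accepts = sum(s >= threshold for s in impostor_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


def far_at_zero_frr(genuine_scores, impostor_scores):
    """FAR at the strictest threshold that still accepts every genuine text."""
    threshold = min(genuine_scores)
    far, _ = far_frr(genuine_scores, impostor_scores, threshold)
    return far
```

Raising the threshold trades false accepts for false rejects, which is why a single operating point such as FRR = 0 is often reported together with the FAR it implies.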
“…Previous work on author verification has been evaluated using sample texts in one language only (Greek [60], Dutch [17,30], English [25,26]) and a specific genre (newspaper articles [60], student essays [30], fiction [25], newswire stories [19], poems [19], blogs [26]). Author verification was also included in previous editions of PAN: the author identification task at PAN-2011 included three author verification problems [1], PAN-2013 focused on author verification and provided corpora in English, Greek, and Spanish [22].…”
Section: Related Work (mentioning)
confidence: 99%
“…Linguistic profiling is a behavioural biometric that attempts to identify and discriminate between users based on linguistic morphology [19]. Linguistic profiling was used for determination of the language variety or genre of a text, or a classification for document routing or inform[...] of the text often based upon content words in language compared to average profil[...] [...]dertaken on this technique as lexical patterns, synta[...] throughout a string of text.…”
Section: Linguistic Profiling (mentioning)
confidence: 99%
“…Linguistic features such as specific aspe[...] [...]n frequency counts of functions words in linguistics and engineering are used as a text profile, which can then [...]les for groups of texts. Considerable research has been [...] and many types of linguistic features can be profiled s[...] [...]x, semantics, information content or item distribut[...] Many researchers concluded that structural and stylom[...] [...]ols for author identification and verification [19][20][21]. [...]idual biometric approaches is that no single biometric [...]rios.…”
Section: Data Collection (mentioning)
confidence: 99%