2017
DOI: 10.1002/asi.23968

Masking topic‐related information to enhance authorship attribution

Abstract: Authorship attribution attempts to reveal the authors of documents. In recent years, research in this field has grown rapidly. However, the performance of state-of-the-art methods is heavily affected when text of known authorship and texts under investigation differ in topic and/or genre. So far, it is not clear how to quantify the personal style of authors in a way that is not affected by topic shifts or genre variations. In this paper, a set of text distortion methods are used attempting to mask topic-related…
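The abstract is cut off before it describes the distortion methods, so the following is only a minimal Python sketch of one plausible form of topic masking, not the paper's actual procedure. It assumes that words outside a small list of frequent (mostly function) words are the ones carrying topic, and it replaces each letter of such words with an asterisk while leaving punctuation and spacing intact. The function name `mask_topic_words` and the `frequent_words` parameter are hypothetical names chosen for illustration.

```python
import re


def mask_topic_words(text, frequent_words):
    """Replace every alphabetic word not in `frequent_words` with asterisks.

    Assumption: `frequent_words` is a set of lowercase words treated as
    topic-neutral (e.g. function words). Word length is preserved so that
    character-level patterns around the masked words remain comparable.
    """
    def mask(match):
        word = match.group(0)
        if word.lower() in frequent_words:
            return word  # keep frequent, presumably style-bearing, words
        return "*" * len(word)  # hide presumably topic-bearing words

    # Only runs of ASCII letters are treated as words in this sketch.
    return re.sub(r"[A-Za-z]+", mask, text)


if __name__ == "__main__":
    sample = "The striker scored in the final"
    print(mask_topic_words(sample, {"the", "in"}))
```

Keeping one asterisk per letter is a design choice of this sketch: features that depend on word length and punctuation survive the masking, while the masked vocabulary no longer reveals the subject matter.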

Cited by 33 publications (39 citation statements)
References 38 publications
“…This bias helps representations of vocabulary-based models to be close. However, such features fail to identify authors in cross-domain scenarios (Stamatatos, 2018), while our model focuses less on topic- and semantic-related words but achieves comparable performance (even better for SimRank and the authorship attribution task).…”
Section: In-depth Analysis
Citation type: mentioning
confidence: 91%
“…Recently, cross-genre, cross-domain, cross-topic and multi-topic data sets [14,20] have been studied in AA tasks. Most studies on topic and domain issues have focused on cross-domain or cross-topic tasks [21–23]. Topic influence has been addressed through generative models where function words and stylometric markers have been used in joint inference of the author and the topic [22].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Topic influence has been addressed through generative models where function words and stylometric markers have been used in joint inference of the author and the topic [22]. On the other hand, studies on removing topic-related information through masking document content has been successfully applied in multi-topic data sets [21]. In another study, a syntactic feature set has been suggested for topic-independent AA [23].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…In particular, style of documents can be used to infer their genre or reveal information about their authors (Stamatatos, 2009). Authorship analysis, dealing with the personal style of authors, is a very active research area (Neal et al., 2018; Rocha et al., 2017; Seroussi, Zukerman, & Bohnert, 2014; Stamatatos, 2018).…”
Section: Introduction
Citation type: mentioning
confidence: 99%