Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 2 2017
DOI: 10.18653/v1/e17-2108

On the Relevance of Syntactic and Discourse Features for Author Profiling and Identification

Abstract: The majority of approaches to author profiling and author identification focus mainly on lexical features, i.e., on the content of a text. We argue that syntactic dependency and discourse features play a significantly more prominent role than they were given in the past. We show that they achieve state-of-the-art performance in author and gender identification on a literary corpus while keeping the feature set small: the used feature set is composed of only 188 features and still outperforms the winner of the …

Cited by 19 publications (12 citation statements)
References 15 publications
“…Sundararajan et al argue that, although syntax can be helpful for cross-genre authorship attribution, combining syntax and lexical information can further boost the performance for cross-topic attribution and single-domain attribution [51]. Furthermore, it has been shown that syntactic dependency and discourse features play a significant role in the task of gender and author identification and author verification [47]. Schwartz et al combine lexical and syntactic features and use a linear classifier for writing style detection [42].…”
Section: Traditional Methods (mentioning)
confidence: 99%
“…Punctuation is a core part of language that functions quite unlike words; punctuation groups words together or separates them, and contributes to the overall structure and meaning of a phrase or sentence. Punctuation plays an important role in distinguishing between different types of text, such as texts by different authors (Soler-Company and Wanner, 2017) or texts produced by different Twitter communities (Tatman and Paullada, 2017). Embeddings are used to generate punctuation for text that is lacking punctuation, such as recorded transcripts (Yi and Tao, 2019).…”
Section: Punctuation (mentioning)
confidence: 99%
“…Sundararajan et al argue that, although syntax can be helpful for cross-genre authorship attribution, combining syntax and lexical information can further boost the performance for cross-topic attribution and single-domain attribution [48]. Further studies which combine lexical and syntactic features include [25,42,45]. performance results [18].…”
Section: Related Work (mentioning)
confidence: 99%