Graph-Based Siamese Network for Authorship Verification

Embarcadero-Ruiz, Daniel; Gómez-Adorno, Helena; Embarcadero-Ruiz, Alberto; Sierra, Gerardo

doi:10.3390/math10020277

Cited by 10 publications

(5 citation statements)

References 27 publications


“…One of the difficulties in comparing prior work is the use of different performance metrics. Some examples are accuracy (Altakrori et al, 2021;Stamatatos, 2018;Jafariakinabad and Hua, 2022;Fabien et al, 2020;Saedi and Dras, 2021;Zhang et al, 2018;Barlas and Stamatatos, 2020), F1 (Murauer and Specht, 2021), C@1 (Bagnall, 2015), recall (Lagutina, 2021), precision (Lagutina, 2021), macro-accuracy (Bischoff et al, 2020), AUC (Bagnall, 2015;Pratanwanich and Lio, 2014), R@8 (Rivera-Soto et al, 2021), and the unweighted average of F1, F0.5u, C@1, and AUC (Manolache et al, 2021;Kestemont et al, 2021;Tyo et al, 2021;Futrzynski, 2021;Peng et al, 2021;Bönninghoff et al, 2021;Boenninghoff et al, 2020;Embarcadero-Ruiz et al, 2022;Weerasinghe et al, 2021).…”

Section: Metrics (mentioning)

confidence: 99%

Valla: Standardizing and Benchmarking Authorship Attribution and Verification Through Empirical Evaluation and Comparative Analysis

Tyo, Dhingra, Lipton 2023

Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacifi


Despite decades of research on authorship attribution (AA) and authorship verification (AV), inconsistent dataset splits/filtering and mismatched evaluation methods make it difficult to assess the state of the art. In this paper, we present a survey of the fields, resolve points of confusion, introduce VALLA that standardizes and benchmarks AA/AV datasets and metrics, provide a large-scale empirical evaluation, and provide apples-to-apples comparisons between existing methods. We evaluate eight promising methods on fifteen datasets (including distribution shifted challenge sets) and introduce a new dataset based on texts archived by Project Gutenberg. Surprisingly, we find that a traditional Ngram-based model performs best on 5 (of 7) AA tasks, achieving an average macro-accuracy of 76.50% (compared to 66.71% for a BERT-based model). However, on the two AA datasets with the greatest number of words per author, as well as on the AV datasets, BERT-based models perform best. While AV methods are easily applied to AA, they are seldom included as baselines in AA papers. We show that through the application of hard-negative mining, AV methods are competitive alternatives to AA methods. VALLA and all experiment code can be found here: https://github.com/JacobTyo/Valla
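Of the metrics listed in the quoted Metrics passage, C@1 is the least self-explanatory, so a minimal sketch may help. It assumes the convention commonly used in the PAN verification tasks: scores lie in [0, 1], a score of exactly 0.5 counts as "unanswered", and unanswered cases are credited at the accuracy achieved on the answered ones. The function name `c_at_1` and the sample data are illustrative and do not come from any of the cited papers.

```python
def c_at_1(y_true, y_score, threshold=0.5):
    """Sketch of C@1: accuracy that rewards leaving hard cases unanswered.

    Assumption: a score exactly equal to `threshold` means "no answer".
    """
    n = len(y_true)
    n_correct = 0
    n_unanswered = 0
    for truth, score in zip(y_true, y_score):
        if score == threshold:
            n_unanswered += 1            # system declined to decide
        elif (score > threshold) == bool(truth):
            n_correct += 1               # answered and correct
    # Unanswered cases are credited at the observed rate n_correct / n.
    return (n_correct + n_unanswered * n_correct / n) / n


# Four verification pairs: two correct, one unanswered, one wrong.
labels = [1, 0, 1, 0]
scores = [0.9, 0.1, 0.5, 0.8]
print(c_at_1(labels, scores))  # 0.625
```

With no unanswered cases the value reduces to plain accuracy, which is one reason results reported as C@1 and as accuracy are not directly comparable.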


“…Table 6 shows the BigBird Cross-Encoder performance using these training datasets compared to the official results of the PAN20/21 challenge top participant systems (Bevendorff and et al, 2021). These include hybrid neural-probabilistic, neural network-based, logistic regression, and graph-based Siamese network systems (Boenninghoff et al, 2020, 2021; Weerasinghe and Greenstadt, 2020; Embarcadero-Ruiz et al, 2021). Note here the systems submitted by the same team are not necessarily the same across PAN20 and PAN21 because some systems used for the PAN20 closed-set challenge relied on fandom information.…”

Section: Bigbird (mentioning)

confidence: 99%

Improving Long-Text Authorship Verification via Model Selection and Data Tuning

Nguyen, Dagli, Alperin et al. 2023

Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities An


Authorship verification is used to link texts written by the same author without needing a model per author, making it useful for deanonymizing users spreading text with malicious intent. Recent advances in Transformer-based language models hold great promise for author verification, though short context lengths and non-diverse training regimes present challenges for their practical application. In this work, we investigate the effect of these challenges in the application of a Cross-Encoder Transformer-based author verification system under multiple conditions. We perform experiments with four Transformer backbones using differently tuned variants of fanfiction data and found that our BigBird pipeline outperformed Longformer, RoBERTa, and ELECTRA and performed competitively against the official top ranked system from the PAN evaluation. We also examined the effect of authors and fandoms not seen in training on model performance. Through this, we found fandom has the greatest influence on true trials, pairs of text written by the same author, and that a balanced training dataset in terms of class and fandom performed the most consistently.


“…The second column of Table 1 presents the number of words per language sub-collection, totaling in 58,061,996 for these 7 languages, while the third column contains the number of tokens, totaling in 73,692,461. For the purpose of this experiment, we produced four document representations for each novel, each in the form of vertical texts, consisting of: (1) words (as in vertical original text of the novel), (2) lemmas (as in vertical lemmatized text), (3) PoS tags (each token in verticalized text is replaced by its PoS tag) and (4) masked text, where tokens were substituted with PoS tag for following PoS tags: ADJ, NOUNS, NPROP, ADV, VERB, AUX, NUM, SYM, X, for PoS tags: DET and PRON tokens are substituted with lemma, while others: ADP, CCONJ, INTJ, PART, PUNCT, SCONJ remained unchanged, as inspired by [35].…”

Section: Dataset (mentioning)

confidence: 99%
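The masking rule in representation (4) of the quoted Dataset passage can be sketched in a few lines. This is an illustration only, not the cited authors' code: the `(word, lemma, pos)` token format, the function names, and the sample sentence are assumptions, and the tag names are copied as they appear in the quote (a real tagger may use different labels; Universal Dependencies, for instance, uses NOUN and PROPN).

```python
# Tags whose tokens are replaced by the PoS tag itself (as listed in the quote).
MASK_WITH_POS = {"ADJ", "NOUNS", "NPROP", "ADV", "VERB", "AUX", "NUM", "SYM", "X"}
# Tags whose tokens are replaced by their lemma.
MASK_WITH_LEMMA = {"DET", "PRON"}


def mask_token(word, lemma, pos):
    """Return the masked form of a single token."""
    if pos in MASK_WITH_POS:
        return pos
    if pos in MASK_WITH_LEMMA:
        return lemma
    # ADP, CCONJ, INTJ, PART, PUNCT, SCONJ stay unchanged; as an assumption,
    # any tag not named in the quote is also left unchanged.
    return word


def mask_text(tokens):
    """Mask a vertical text given as (word, lemma, pos) triples."""
    return [mask_token(word, lemma, pos) for word, lemma, pos in tokens]


sentence = [
    ("The", "the", "DET"),
    ("dogs", "dog", "NOUNS"),
    ("ran", "run", "VERB"),
    ("to", "to", "ADP"),
    ("them", "they", "PRON"),
    (".", ".", "PUNCT"),
]
print(mask_text(sentence))  # ['the', 'NOUNS', 'VERB', 'to', 'they', '.']
```

The effect is that content words collapse to their grammatical category while function words and punctuation survive, so the representation keeps syntactic style and drops topic vocabulary.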

“…Recently, however, for some highly-inflected languages, most frequent lemmas emerged as a better alternative to most frequent words [39]. The PoS tags and the document representation with masked words, where PoS labels are used to mask predefined set of PoS classes, also achieved good results for specific problems [35]. In evaluation of this experiment we used the following document representations: most frequent words, lemmas, PoS trigrams, and PoS-masked bigrams (D_word, D_lemma, D_pos and D_masked), as the secondary baseline methods.…”

Section: Baseline (mentioning)

confidence: 99%

Parallel Stylometric Document Embeddings with Deep Learning Based Language Models in Literary Authorship Attribution

et al. 2022


This paper explores the effectiveness of parallel stylometric document embeddings in solving the authorship attribution task by testing a novel approach on literary texts in 7 different languages, totaling in 7051 unique 10,000-token chunks from 700 PoS and lemma annotated documents. We used these documents to produce four document embedding models using Stylo R package (word-based, lemma-based, PoS-trigrams-based, and PoS-mask-based) and one document embedding model using mBERT for each of the seven languages. We created further derivations of these embeddings in the form of average, product, minimum, maximum, and l2 norm of these document embedding matrices and tested them both including and excluding the mBERT-based document embeddings for each language. Finally, we trained several perceptrons on the portions of the dataset in order to procure adequate weights for a weighted combination approach. We tested standalone (two baselines) and composite embeddings for classification accuracy, precision, recall, weighted-average, and macro-averaged F1-score, compared them with one another and have found that for each language most of our composition methods outperform the baselines (with a couple of methods outperforming all baselines for all languages), with or without mBERT inputs, which are found to have no significant positive impact on the results of our methods.
