2016
DOI: 10.1609/aaai.v30i1.9924

Authorship Attribution Using a Neural Network Language Model

Abstract: In practice, training language models for individual authors is often expensive because of limited data resources. In such cases, Neural Network Language Models (NNLMs) generally outperform the traditional non-parametric N-gram models. Here we investigate the performance of a feed-forward NNLM on an authorship attribution problem with a moderate author set size and relatively limited data. We also consider how text topics impact performance. Compared with a well-constructed N-gram baseline method with Knes…
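For context, here is a minimal sketch of the kind of feed-forward (Bengio-style) NNLM the abstract describes, with attribution by per-author perplexity. The architecture, hyperparameters, and decision rule below are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of a Bengio-style feed-forward NNLM for authorship
# attribution: train one model per author, then attribute a test text to
# the author whose model assigns it the lowest perplexity. Architecture
# and hyperparameters are illustrative, not the paper's configuration.
import math

import torch
import torch.nn as nn

CONTEXT = 3  # three preceding words -> a 4-gram model


class FeedForwardNNLM(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.ff = nn.Sequential(
            nn.Linear(CONTEXT * embed_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, vocab_size),
        )

    def forward(self, ctx):                 # ctx: (batch, CONTEXT)
        e = self.embed(ctx).flatten(1)      # (batch, CONTEXT * embed_dim)
        return self.ff(e)                   # unnormalized next-word logits


def perplexity(model, token_ids):
    """Per-token perplexity of a tokenized text under one author's model."""
    loss_fn = nn.CrossEntropyLoss(reduction="sum")
    total, count = 0.0, 0
    with torch.no_grad():
        for i in range(CONTEXT, len(token_ids)):
            ctx = torch.tensor([token_ids[i - CONTEXT:i]])
            target = torch.tensor([token_ids[i]])
            total += loss_fn(model(ctx), target).item()
            count += 1
    return math.exp(total / max(count, 1))


def attribute(author_models, token_ids):
    """author_models: dict of author name -> trained FeedForwardNNLM."""
    return min(author_models, key=lambda a: perplexity(author_models[a], token_ids))
```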


Cited by 16 publications (6 citation statements). References 5 publications.
“…It was brought out that several features show temporal changes, increasing or decreasing over time, and that the language model for each author may differ. Also, Ge et al. [40] explored language models using Neural Network Language Models (NNLMs) and compared their performance with n-gram models (i.e., 4-gram). The NNLM-based work achieves promising results compared with the N-gram models.…”
Section: Language Models (citation type: mentioning, confidence: 99%)
“…The results indicated a 97% success rate with the Levenberg-Marquardt-based classifier. Ge et al. [40] used a feedforward neural network to create a lightweight language model that performed better than the baseline n-gram method on a limited dataset. Shrestha et al. [118] presented a new model using Convolutional Neural Networks (CNNs), which focused on the authorship attribution of short texts.…”
Section: Deep Learning Models (citation type: mentioning, confidence: 99%)
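For the short-text CNN approach mentioned in that statement, a minimal character-level classifier might look like the sketch below; the layer sizes, filter widths, and pooling scheme are assumptions, not the published architecture.

```python
# Hypothetical character-level CNN for short-text authorship attribution,
# in the spirit of the model mentioned above; layer sizes, filter widths,
# and pooling are assumptions, not the published architecture.
import torch
import torch.nn as nn


class CharCNN(nn.Module):
    def __init__(self, n_chars, n_authors, embed_dim=32, n_filters=64):
        super().__init__()
        self.embed = nn.Embedding(n_chars, embed_dim)
        # Parallel convolutions over character windows of width 3, 4, and 5.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, n_filters, kernel_size=k) for k in (3, 4, 5)
        )
        self.out = nn.Linear(3 * n_filters, n_authors)

    def forward(self, char_ids):                  # char_ids: (batch, seq_len)
        e = self.embed(char_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Max-over-time pooling on each convolution's feature maps.
        pooled = [conv(e).relu().max(dim=2).values for conv in self.convs]
        return self.out(torch.cat(pooled, dim=1))  # author logits
```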
“…In order to improve accuracy, researchers are performing experiments on a variety of languages, leveraging diverse data sets, and presenting results with differing degrees of complexity. Ge et al. [17] conducted forensic analysis on a vast Urdu corpus, which was tested with Latent Dirichlet Allocation (LDA) and cosine similarity to detect textual similarity.…”
Section: Literature Review (citation type: mentioning, confidence: 99%)
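A minimal sketch of the LDA-plus-cosine-similarity pipeline that statement describes, using scikit-learn; the corpus, topic count, and vectorizer settings are placeholder assumptions.

```python
# Hypothetical sketch of the LDA + cosine-similarity pipeline mentioned
# above: represent each document by its topic distribution, then compare
# documents in topic space. Corpus and parameters are placeholders.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "first questioned document text goes here",
    "second questioned document text goes here",
]  # placeholder corpus

counts = CountVectorizer().fit_transform(docs)            # bag-of-words counts
lda = LatentDirichletAllocation(n_components=10, random_state=0)
topic_dist = lda.fit_transform(counts)                    # (n_docs, n_topics)

# Cosine similarity between the two documents' topic distributions.
sim = cosine_similarity(topic_dist[0:1], topic_dist[1:2])[0, 0]
print(f"topic-space similarity: {sim:.3f}")
```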
“…The best performing technique was a character-level CNN, which was compared to other conventional approaches. In addition, a feedforward neural network language model was applied by Ge et al. [51] to train an attribution classifier on a dataset with only a small amount of data. In comparison to n-gram baselines, the model trained a representation for each word using a 4-gram window and achieved an accuracy of 95%.…”
Section: Related Studies (citation type: mentioning, confidence: 99%)
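As a companion to the NNLM sketch earlier, here is a minimal Kneser-Ney 4-gram baseline of the kind these comparisons reference, using NLTK's nltk.lm module; the lowest-perplexity attribution rule is the same assumption as before, not a procedure the cited papers specify.

```python
# Hypothetical Kneser-Ney 4-gram baseline of the kind these comparisons
# reference, via NLTK; the lowest-perplexity attribution rule is the same
# assumption as in the NNLM sketch above.
from nltk.lm import KneserNeyInterpolated
from nltk.lm.preprocessing import pad_both_ends, padded_everygram_pipeline
from nltk.util import ngrams

ORDER = 4  # 4-gram model


def train_kn_model(sentences):
    """sentences: list of token lists from one author's training text."""
    train_grams, vocab = padded_everygram_pipeline(ORDER, sentences)
    model = KneserNeyInterpolated(ORDER)
    model.fit(train_grams, vocab)
    return model


def kn_perplexity(model, tokens):
    """Perplexity of a held-out token sequence; lower = better author fit."""
    padded = list(pad_both_ends(tokens, n=ORDER))
    return model.perplexity(ngrams(padded, ORDER))
```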