Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods 2016
DOI: 10.5220/0005710005970604

Domain Specific Author Attribution based on Feedforward Neural Network Language Models

Abstract: Authorship attribution refers to the task of automatically determining the author of a given sample of text. It is a problem with a long history and a wide range of applications. Building author profiles using language models is one of the most successful methods to automate this task. New language modeling methods based on neural networks alleviate the curse of dimensionality and usually outperform conventional N-gram methods. However, there has not been much research applying them to autho…
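
The idea of building one language-model profile per author and attributing a text to the best-fitting profile can be illustrated with a minimal sketch. It uses the conventional N-gram baseline the abstract mentions (a bigram model with add-one smoothing), not the paper's feedforward neural network model. The function names, the toy texts, and the author labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram_profile(text):
    """Count bigrams and their contexts for one author's training text."""
    tokens = ["<s>"] + text.lower().split()
    contexts = Counter(tokens[:-1])                  # how often each word precedes another
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))  # adjacent word pairs
    return contexts, bigrams


def perplexity(profile, text, vocab_size):
    """Perplexity of `text` under one author's add-one-smoothed bigram model."""
    contexts, bigrams = profile
    tokens = ["<s>"] + text.lower().split()
    n = len(tokens) - 1
    if n == 0:
        raise ValueError("text must contain at least one token")
    log_prob = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams from getting zero probability.
        p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / n)


def attribute(profiles, text, vocab_size):
    """Return the author whose profile gives the lowest perplexity."""
    return min(profiles, key=lambda a: perplexity(profiles[a], text, vocab_size))


# Hypothetical toy corpus: one short training text per author.
samples = {
    "alice": "the cat sat on the mat the cat ran",
    "bob": "stocks fell sharply as markets closed lower today",
}
vocab = {"<s>"} | {w for t in samples.values() for w in t.lower().split()}
vocab_size = len(vocab) + 1  # one extra slot for unknown words
profiles = {author: train_bigram_profile(t) for author, t in samples.items()}

predicted = attribute(profiles, "the cat sat on the mat", vocab_size)
```

A neural language model would replace the smoothed count ratio with a learned probability, while the attribution step (pick the author with the lowest perplexity) could stay the same.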

Cited by 12 publications (13 citation statements)
References: 14 publications
“…Different ParChoice-variants performed comparably. Unlike A4NT, ParChoice achieved high scores (∼ 50) on both large and small datasets. We also compared METEOR scores with each ParChoice-module applied alone in the ParChoice-LSTM variant.…”
Section: Semantic Retainment
Citation type: mentioning
confidence: 98%
“…Author attribution via stylometry has traditionally focused on standard machine learning (ML) algorithms and feature engineering [1,22,33,56,68,74], but deep learning methods have become more prominent in recent years [4,9,19,67]. While there is no unanimous agreement on the most effective features [22,24,33], the Writeprints feature set has been widely applied with success [1-3, 15, 47, 53, 74].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…They reported an accuracy level of 76.1% using a corpus written by 50 authors where each author has 1,000 samples. Moreover, Ge [8] performed authorship identification using the feedforward neural networks. The authors from [8] reported 95% accuracy and noted that this task is too easy to perform due to the availability of huge training data.…”
Section: Deep Learning Based Methods To Authorship Identification
Citation type: mentioning
confidence: 99%
“…The word tokenization is performed using DeepCut. The rest of the features are calculated using Thai Language Toolkit (TLTK). We provide an example of Thai text sample in Table 2 and show the computed stylometric feature values in Table 3.…”
Section: Preprocessing
Citation type: mentioning
confidence: 99%
“…Reference [3] used a multi-head recurrent neural network (RNN) as the basic framework. Reference [8] proposed a neural language model based approach. Reference [20] presented a multi-channel convolutional neural network (CNN) model, which integrates character embeddings and word embeddings.…”
Section: Related Work
Citation type: mentioning
confidence: 99%