2016 IEEE 16th International Conference on Data Mining (ICDM) 2016
DOI: 10.1109/icdm.2016.0147

A Scalable Framework for Stylometric Analysis Query Processing

Abstract: Stylometry is the statistical analysis of variations in an author's literary style. The technique has been used in many linguistic analysis applications, such as author profiling, authorship identification, and authorship verification. Over the past two decades, authorship identification has been extensively studied by researchers in the area of natural language processing. However, these studies are generally limited to (i) a small number of candidate authors, and (ii) documents with similar lengths. In this pa…

Cited by 15 publications
(32 citation statements)
References 16 publications
“…However, the main problem of the PkNN method is that the classifier is sensitive to outliers. To mitigate this problem, we use a stylometric data representation, which makes use of set similarity search [36] such that the stylistic variations between documents can be measured as a set distance [21].…”
Section: Limitation (citation type: mentioning)
confidence: 99%
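To illustrate what "measuring stylistic variation as a set distance" could look like, here is a minimal sketch. It assumes character trigrams as the stylometric feature set and Jaccard distance as the set distance. Neither choice is stated in the quoted passage, and the helper names `char_ngrams` and `jaccard_distance` are hypothetical.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams in a lowercased text.

    Assumption: character n-grams stand in for the stylometric features;
    the cited work may use a different representation.
    """
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard_distance(a, b):
    """Jaccard distance 1 - |A & B| / |A | B| between two sets.

    Defined here as 0.0 when both sets are empty.
    """
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Two near-identical sentences and one unrelated sentence.
doc_a = char_ngrams("the cat sat on the mat")
doc_b = char_ngrams("the cat sat on the hat")
doc_c = char_ngrams("quarterly revenue exceeded forecasts")

d_ab = jaccard_distance(doc_a, doc_b)  # small: most trigrams are shared
d_ac = jaccard_distance(doc_a, doc_c)  # large: no shared trigrams
```

A nearest-neighbor classifier could then rank candidate authors by such a distance, though how the cited framework combines the two is not described in the statements shown here.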
“…On the other hand, in a study which employed machine translation [4], the logistic regression classification has led to superior accuracy using a larger set of features (i.e., word n-grams) in comparison to other classifiers. The nearest neighbor classifier has reported a reasonable accuracy (71%) when used in a corpus with a large number of candidate authors [36].…”
Section: Stylometric Analysis Techniques (citation type: mentioning)
confidence: 99%