Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities An 2019
DOI: 10.18653/v1/w19-2507

Stylometric Classification of Ancient Greek Literary Texts by Genre

Abstract: Classification of texts by genre is an important application of natural language processing to literary corpora but remains understudied for premodern and non-English traditions. We develop a stylometric feature set for ancient Greek that enables identification of texts as prose or verse. The set contains over 20 primarily syntactic features, which are calculated according to custom, language-specific heuristics. Using these features, we classify almost all surviving classical Greek literature as prose or vers…

Cited by 8 publications (4 citation statements)
References 22 publications
“…In related work, we have developed a similar feature set for Ancient Greek, which has been used to classify prose and verse and, at a more fine-grained level, epic and drama (Gianitsos et al., 2019). Our work on Old English has demonstrated the utility of related features for various literary and attribution studies (Neidorf et al., 2019).…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…In recent years, several works have proposed traditional machine learning approaches to the study of ancient texts. This body of work has focused on optical character recognition and visual analysis [31][32][33][34], writer identification [35][36][37] and text analysis [38][39][40][41][42][43][44], stylometrics [45] and document dating [46]. It is only very recently that scholarship has begun to use deep learning and neural networks for optical character recognition [47][48][49][50][51][52][53][54][55], text analysis [56], machine translation of ancient texts [57][58][59], authorship attribution [60,61] and deciphering ancient languages [62,63], and been applied to study the form and style of epigraphic monuments [64].…”
Section: Previous Work; citation type: mentioning
confidence: 99%
“…There are few studies on text classification of Greek literature. Most of them show the language independence of their approach in the genre identification task [8,9,35], or test several feature engineering approaches [7,13]. None of the existed studies dealt with the language variety problem, while as far as we know, none of the existed studies have experimented with text classification in Greek literature leveraging transformer-based models.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…To the best of our knowledge, this is the first work to use BERT for classifying Greek literature in general. Most of the past studies put more emphasis on feature engineering [9,13].…”
Section: Introduction; citation type: mentioning
confidence: 99%