2020
DOI: 10.1073/pnas.2003794117

Restoration of fragmentary Babylonian texts using recurrent neural networks

Abstract: The main sources of information regarding ancient Mesopotamian history and culture are clay cuneiform tablets. Many of these tablets are damaged, leading to missing information. Currently, the missing text is manually reconstructed by experts. We investigate the possibility of assisting scholars by modeling the language with recurrent neural networks and automatically completing the breaks in ancient Akkadian texts from Achaemenid-period Babylonia.
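The completion task the abstract describes can be pictured with a small language-model sketch: train a recurrent network on intact transliterated text, then rank candidate tokens for a break given the surviving context. The PyTorch example below is a minimal illustration under toy assumptions (invented vocabulary size, dimensions, and candidate ids); it is not the authors' published architecture or training setup.

```python
# Illustrative sketch: ranking candidate restorations for a break with an
# LSTM language model. All sizes and token ids here are toy assumptions.
import torch
import torch.nn as nn

class LstmLM(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) -> next-token logits at each position
        hidden, _ = self.lstm(self.embed(tokens))
        return self.out(hidden)

def rank_candidates(model, prefix_ids, candidate_ids):
    """Score each candidate token as the continuation of the intact prefix."""
    model.eval()
    with torch.no_grad():
        logits = model(prefix_ids.unsqueeze(0))          # (1, len, vocab)
        next_log_probs = logits[0, -1].log_softmax(-1)   # dist. after prefix
    scored = [(c, next_log_probs[c].item()) for c in candidate_ids]
    return sorted(scored, key=lambda pair: -pair[1])

# Toy usage: 100-token vocabulary, untrained weights, three candidate tokens.
model = LstmLM(vocab_size=100)
prefix = torch.tensor([5, 17, 42, 8])  # tokens preceding the break
print(rank_candidates(model, prefix, [3, 11, 64]))
```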


Cited by 31 publications (26 citation statements)
References 27 publications
“…We use two strong baselines: (1) the LSTM model that was proposed by Fetaya et al (2020), and was retrained on our dataset using their default configuration; and (2) the cased BERT-base multilingual model, without finetuning over Oracc. We compare these two baselines against our models, as presented in Section 4.2, trained in three configurations: (1) BERT+AKK(mono) refers to the reduced-size BERT model, trained from scratch on the Akkadian texts from Oracc; (2) MBERT+Akk is a finetuned version of M-BERT on the Akkadian texts, using the model's additional free tokens to encode sub-word tokens from Oracc; and…”
Section: Models and Datasets (mentioning)
confidence: 99%
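The "MBERT+Akk" configuration in the quoted passage finetunes multilingual BERT after mapping Oracc sub-word tokens into its vocabulary. Below is a minimal sketch of one common way to do this with Hugging Face transformers, appending new tokens and resizing the embedding matrix; the token strings are hypothetical, and the cited paper's exact procedure (reusing the model's reserved free token slots) may differ.

```python
# Sketch: registering new sub-word tokens with multilingual BERT and growing
# the embedding matrix before masked-LM finetuning. The token list is a
# hypothetical placeholder, not the cited paper's actual vocabulary.
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertForMaskedLM.from_pretrained("bert-base-multilingual-cased")

new_tokens = ["szum-ma", "a-wi-lum", "i-na"]  # invented transliteration units
num_added = tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(tokenizer))  # allocate rows for new ids

# The registered strings now map to single ids instead of being split apart.
batch = tokenizer("szum-ma a-wi-lum", return_tensors="pt")
print(num_added, batch["input_ids"])
```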
“…Most related to our work, Fetaya et al (2020) designed an LSTM model which similarly aims to complete fragmentary sequences in Babylonian texts. They differ from us in two major aspects.…”
Section: Related Work (mentioning)
confidence: 99%
“…Besides emotion and happiness, dictionary-driven representations are also extensively used to detect depression in social media [105,158]. Despite the wide adoption of dictionary-driven representations, Jaidka et al [77] made a comparison between unsupervised dictionary-driven and supervised data-driven methods, and verified that the latter is more robust for well-being estimation from social media data. Therefore, outside of the dictionary-driven representations, linguistic features, such as n-gram and BOW, are also applied to represent text in psychology, combined with supervised machine learning methods.…”
Section: A11 Symbol-based Representation (mentioning)
confidence: 99%
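As a concrete picture of the supervised, data-driven setup the quote contrasts with dictionary scoring, the scikit-learn sketch below builds n-gram bag-of-words features and fits a classifier. The texts and labels are invented toy data, not the cited studies' corpora.

```python
# Sketch: n-gram BOW features feeding a supervised classifier. Toy data only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["feeling great today", "so tired and sad", "what a lovely morning"]
labels = [1, 0, 1]  # hypothetical well-being labels

vectorizer = CountVectorizer(ngram_range=(1, 2))  # unigrams and bigrams
features = vectorizer.fit_transform(texts)

clf = LogisticRegression().fit(features, labels)
print(clf.predict(vectorizer.transform(["sad morning"])))
```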
“…Topic model based representation is often operated as one of the features for the classification task, since it can supply the semantic information of text. For instance, Jaidka et al [77] leveraged the representation learned from LDA model to predict the subjective well-being from Twitter. Besides, Eichstaedt et al [51] also used LDA to represent posts on Facebook and predict the depression of users.…”
(mentioning)
confidence: 99%
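The LDA-based pipeline described above can be sketched in a few lines: topic proportions inferred from word counts become per-document features for a downstream predictor. The corpus and ratings below are invented toy data, not the datasets of the cited studies.

```python
# Sketch: LDA topic proportions as document features for a predictor.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import Ridge

docs = ["work deadline stress", "beach holiday sun",
        "late shift exhausted", "family picnic weekend"]
scores = [2.0, 8.5, 3.0, 7.5]  # hypothetical well-being ratings

counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_features = lda.fit_transform(counts)  # (n_docs, n_topics) proportions

reg = Ridge().fit(topic_features, scores)
print(reg.predict(topic_features[:1]))
```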
“…The aforementioned process also highlights other significant start-up costs that come with AIG: first, in the form described above, AIG is anything but automatic, and extensive preparatory steps are required before any automation takes place (Kosh et al, 2019). Second, in order to actually use computer algorithms, one must either be able to train them, which requires expertise and large computational storage and processing capacities (Fetaya et al, 2020), or gain access to existing software, which has hitherto not been widely available (Royal et al, 2018).…”
Section: Automated Item Generation (AIG) (mentioning)
confidence: 99%