Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Hu 2018
DOI: 10.18653/v1/n18-2063

Are Automatic Methods for Cognate Detection Good Enough for Phylogenetic Reconstruction in Historical Linguistics?

Abstract: We evaluate the performance of state-of-the-art algorithms for automatic cognate detection by comparing how useful automatically inferred cognates are for the task of phylogenetic inference compared to classical manually annotated cognate sets. Our findings suggest that phylogenies inferred from automated cognate sets come close to phylogenies inferred from expert-annotated ones, although on average, the latter are still superior. We conclude that future work on phylogenetic reconstruction can profit much from …

Cited by 38 publications (33 citation statements)
References 43 publications

“…Automated cognate detection. There is a rich literature on developing automated cognate detection methods [28,29] for the purpose of detecting cognates and inferring phylogenetic trees [30,31]. The automated cognate detection methods compute a similarity between two words based on hand-crafted phonetic similarity measures [32,33], linear classifiers using word similarity scores [34,35] or phoneme n-grams as features for training [36,37] on hand-annotated training data, and neural networks [38].…”
Section: Methods (mentioning)
confidence: 99%
“…Recent research in computational historical linguistics [30,46] has shown that the trees inferred using cognates inferred from automated methods are as good as those inferred from expert-annotated cognate judgments. We believe that the next area for application of these cognate detection methods is in linguistic dating, because the dating process has traditionally been heavily dependent on manual cognate detection, which is time-consuming, potentially biased, and not yet available for most of the world's language families.…”
Section: Methods (mentioning)
confidence: 99%
“…Phonetic similarity measures, however, require phonetic transcriptions to be available a priori. More recently, historical linguists have started exploiting identified cognates to infer phylogenetic relationships across languages (Rama et al., 2018; Jäger, 2018).…”
Section: State of the Art (mentioning)
confidence: 99%
“…In a separate paper, Rama et al. (2018) presented pruned datasets for five different language families (Pama-Nyungan and Sino-Tibetan in addition to Austronesian, Austro-Asiatic, and Indo-European), consisting of only those languages that show the highest mutual lexical coverage. For each dataset, the authors pruned any language which has less than 75% mutual attestations with the rest of the languages.…”
Section: Effect of Lexical Coverage (mentioning)
confidence: 99%
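
The coverage-based pruning described in the last citation statement can be illustrated with a minimal Python sketch. The statement does not say exactly how "mutual attestations" are computed, so the definition here is an assumption: a language's score is the average share of the concept list that it attests together with each other remaining language, and the lowest-scoring language is dropped repeatedly until every remaining language reaches the threshold. The function names `mutual_coverage` and `prune_by_coverage` and the toy data are hypothetical and do not come from Rama et al. (2018).

```python
from typing import Dict, Set


def mutual_coverage(a: Set[str], b: Set[str], n_concepts: int) -> float:
    """Share of the concept list that is attested in both languages."""
    return len(a & b) / n_concepts


def prune_by_coverage(
    attested: Dict[str, Set[str]],
    n_concepts: int,
    threshold: float = 0.75,
) -> Dict[str, Set[str]]:
    """Drop low-coverage languages one at a time (assumed procedure).

    `attested` maps a language name to the set of concepts for which
    that language has at least one word in the dataset.
    """
    kept = dict(attested)
    while len(kept) > 1:
        # Average mutual coverage of each language with all the others.
        scores = {
            lang: sum(
                mutual_coverage(concepts, other, n_concepts)
                for name, other in kept.items()
                if name != lang
            ) / (len(kept) - 1)
            for lang, concepts in kept.items()
        }
        # Lowest score first; ties are broken by name for determinism.
        lang, score = min(scores.items(), key=lambda item: (item[1], item[0]))
        if score >= threshold:
            break
        del kept[lang]
    return kept


if __name__ == "__main__":
    # Toy data: four languages over a four-concept list.
    toy = {
        "A": {"c1", "c2", "c3", "c4"},
        "B": {"c1", "c2", "c3", "c4"},
        "C": {"c1", "c2", "c3"},
        "D": {"c1"},
    }
    print(sorted(prune_by_coverage(toy, n_concepts=4)))
```

Removing one language per pass and then rescoring matters because a sparse language lowers the scores of all the others; scoring only once could discard languages that are well covered relative to the final set.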