2018
DOI: 10.48550/arxiv.1804.05416
Preprint

Are Automatic Methods for Cognate Detection Good Enough for Phylogenetic Reconstruction in Historical Linguistics?

Cited by 3 publications
(4 citation statements)
“…An alternative to LexStat is the OnlinePMI method published by Rama et al (2017). OnlinePMI performed better than LexStat in identifying cognate classes (Rama et al, 2017), but it gave worse results when reconstructing trees using Bayesian phylogenetics on the bases of these classes (Rama et al, 2018). Compared to LexStat, the OnlinePMI method is better able to handle data with lower numbers of shared concepts. This comes at the cost of a significant random component, making it harder to optimize OnlinePMI for a particular application.…”
Section: 2 (mentioning)
confidence: 99%
“…On the methodological side, recent years have seen a significant improvement of computational tools, practices, models, and algorithms for inferring language histories from word lists. For example, there are now new methods for automatic cognate detection using state-of-the-art pairwise phonetic alignment algorithms (Jäger et al, 2017; List, 2012a; List et al, 2017; List, 2012b; List et al, 2018b) that reach nearly 90% accuracy (B-cubed F-score) in determining the cognate sets to be used in phylogenetic analyses (Rama et al, 2018). Bayesian phylogenetic inference research has empirically shown which models are useful for lexical data (Chang et al, 2015; Kolipakam et al, 2018).…”
Section: Introduction (mentioning)
confidence: 99%
“…Indian language pairs borrow a large number of cognates and false friends due to this shared ancestry. Knowing and utilising these cognates/false friends can help improve the performance of computational phylogenetics (Rama et al, 2018) as well as cross-lingual information retrieval (Meng et al, 2001) in the Indian setting, thus encouraging us to investigate this problem for this linguistic area. Some other applications of cognate detection in NLP have been sentence alignment (Simard et al, 1993; Melamed, 1999), inducing translation lexicons (Mann and Yarowsky, 2001; Tufis, 2002), improving statistical machine translation models (Al-Onaizan et al, 1999), and identification of confusable drug names (Kondrak and Dorr, 2004).…”
Section: Introduction (mentioning)
confidence: 99%
“…In numerous cases, these words also are morphologically altered as per the Indian language morphological rules to generate new variants of existing words. Detection of such variants or 'Cognates' across languages helps Cross-lingual Information Retrieval (CLIR) (Makin et al, 2008; Meng et al, 2001), Machine Translation (MT) (Kondrak, 2005; Kondrak et al, 2003; Al-Onaizan et al, 1999), and Computational Phylogenetics (Rama et al, 2018). Cognates are etymologically related words across two languages (Crystal, 2011).…”
Section: Introduction (mentioning)
confidence: 99%