Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 1 2017
DOI: 10.18653/v1/e17-1113

Using support vector machines and state-of-the-art algorithms for phonetic alignment to identify cognates in multi-lingual wordlists

Abstract: Most current approaches in phylogenetic linguistics require as input multilingual word lists partitioned into sets of etymologically related words (cognates). Cognate identification is so far done manually by experts, which is time consuming and as of yet only available for a small number of well-studied language families. Automatizing this step will greatly expand the empirical scope of phylogenetic methods in linguistics, as raw wordlists (in phonetic transcription) are much easier to obtain than wordlists i…

Cited by 42 publications (40 citation statements)
References 24 publications
“…Using machine learning algorithms is not new to the field of linguistics, though it is one of the more recent methods. While these approaches are found in an increasing number of studies in linguistics in general, in historical linguistics in particular the method is less used although some studies have been published in this or adjacent fields such as cladistics (Jäger et al., 2017; Jäger and Sofroniev, 2016). Since this approach of predicting sound features by the features in the phonetic environment only works synchronically, the deep neural network used for this needs to be trained on better known phonological features as the basis for predicting unknown features.…”
Section: The Deep Neural Network Approach
Citation type: mentioning
confidence: 99%
“…We created a goldstandard dataset from the data used in [22] (which is drawn from the same sources as the data used in [21] but has been manually edited to correct annotation mistakes). Only the 40 ASJP concepts were used.…”
Section: Creating a Goldstandard
Citation type: mentioning
confidence: 99%
“…B-Cubed scores offer a straightforward way to compare partitioning analyses (or cluster analyses) with each other. In the task of automatic cognate detection in computational historical linguistics, for example, B-Cubed scores are frequently used to compare how well an algorithm performs in comparison with a gold standard (Hauer and Kondrak 2011; Jäger, List, and Sofroniev 2017; List, Greenhill, and Gray 2017).…”
Section: Measuring Differences In Reconstruction Systems
Citation type: mentioning
confidence: 99%
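
The citation statement above mentions B-Cubed scores as a way to compare an automatic cognate partition with a gold standard. The following is a minimal sketch of the standard B-Cubed precision, recall, and F-score, not the implementation used in any of the cited papers. The function name `b_cubed`, the dictionary-based input format, and the toy word labels are assumptions made for illustration only.

```python
from collections import defaultdict


def b_cubed(gold, predicted):
    """Return (precision, recall, f_score) for two partitions of the same items.

    Both arguments map each item (e.g. a word form) to a cluster label.
    This input format is an illustrative assumption, not taken from the paper.
    """
    if not gold or gold.keys() != predicted.keys():
        raise ValueError("both partitions must cover the same non-empty item set")

    # Invert the item -> label mappings into label -> set of items.
    gold_clusters = defaultdict(set)
    pred_clusters = defaultdict(set)
    for item, label in gold.items():
        gold_clusters[label].add(item)
    for item, label in predicted.items():
        pred_clusters[label].add(item)

    precision_sum = 0.0
    recall_sum = 0.0
    for item in gold:
        gold_set = gold_clusters[gold[item]]
        pred_set = pred_clusters[predicted[item]]
        overlap = len(gold_set & pred_set)  # always >= 1: the item itself
        precision_sum += overlap / len(pred_set)
        recall_sum += overlap / len(gold_set)

    n = len(gold)
    precision = precision_sum / n
    recall = recall_sum / n
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical toy data: four words for one concept, two gold cognate sets.
gold = {"de:Hand": "A", "en:hand": "A", "fr:main": "B", "es:mano": "B"}
# A predicted partition that wrongly groups the French word with the Germanic ones.
predicted = {"de:Hand": 1, "en:hand": 1, "fr:main": 1, "es:mano": 2}

precision, recall, f_score = b_cubed(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f={f_score:.3f}")
```

In this toy case the over-merged predicted cluster lowers precision, while splitting the gold set {fr:main, es:mano} lowers recall, which is the trade-off B-Cubed is meant to expose.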