Proceedings - Natural Language Processing in a Deep Learning World 2019
DOI: 10.26615/978-954-452-056-4_094

incom.py – A Toolbox for Calculating Linguistic Distances and Asymmetries between Related Languages

Abstract: Languages may be differently distant from each other and their mutual intelligibility may be asymmetric. In this paper we introduce incom.py, a toolbox for calculating linguistic distances and asymmetries between related languages. incom.py allows linguist experts to quickly and easily perform statistical analyses and compare those with experimental results. We demonstrate the efficacy of incom.py in an incomprehension experiment on two Slavic languages: Bulgarian and Russian. Using incom.py we were able to va…

Cited by 6 publications (27 citation statements)
References 21 publications
“…As a representation of the (dis-)similarity of the PL stimulus toward CS, a measure referred to as total pronunciation-based distance is determined for the whole sentence, the final 3-g, 2-g, and target word and examined for correlations with intelligibility. The distances are calculated automatically with the help of the incom.py toolbox (Mosbach et al, 2019) for each word. Distances of the 2-g, 3-g, and sentences are the mean distances of the individual words they consist of.…”
Section: Methods (mentioning)
confidence: 99%
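
The aggregation described in this statement (a span's distance is the mean of its words' distances) can be sketched as follows. This is only an illustration: the per-word values are made up, and the function name `span_distance` is hypothetical rather than part of incom.py.

```python
from statistics import mean


def span_distance(word_distances):
    """Distance of a multi-word span, taken as the mean of its per-word distances."""
    return mean(word_distances)


# Hypothetical per-word distances for a four-word stimulus sentence.
sentence = [0.0, 0.25, 0.5, 0.25]

sentence_distance = span_distance(sentence)       # whole sentence
final_trigram = span_distance(sentence[-3:])      # final 3-g
final_bigram = span_distance(sentence[-2:])       # final 2-g
target_word = sentence[-1]                        # target word on its own
```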
“…In the present study we extend the incom.py toolbox (Mosbach et al, 2019) focusing on mutual intelligibility aspects in oral intercomprehension. First, we compare the available measuring methods for linguistic distances and asymmetries, i.e., Levenshtein distance and word adaptation surprisal, as predictors of mutual intelligibility in auditory perception and add word adaptation entropy as an additional metric for asymmetric intelligibility.…”
Section: This Paper (mentioning)
confidence: 99%
“…Employing a modified Levenshtein algorithm [Levenshtein 1965], which disallows matching between a vowel and a consonant, we have calculated the orthographic and the phonetic distances between 120 BG-RU cognate pairs. We calculated this objective measure automatically using the incom.py tool of [Mosbach et al 2019]. While in the basic form of the algorithm all string operations have the same cost, we use 0 for the cost of mapping a character/sound to itself, e.g.…”
Section: Predictors Of Mutual Intelligibility, 4.1 Levenshtein Distance (mentioning)
confidence: 99%
“…For example, CAS is defined as in (1). Since WAS between two words is computed by summing up the CAS and the SAS values of the contained characters and sounds in the aligned word pair, it strongly depends on the number of available word pairs (for more details see [Mosbach et al 2019], [Stenger 2019]). Finally, we normalize the WAS based on the set of 120 BG-RU cognates.…”
Section: Word Adaptation Surprisal (mentioning)
confidence: 99%
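
The summation described in this statement can be sketched as below. Definition (1) is not reproduced in the excerpt, so the sketch assumes the usual surprisal form, CAS = -log2 P(native character | stimulus character), with probabilities estimated as relative frequencies over the aligned word pairs. The function names and the toy alignments are hypothetical, and the normalization step mentioned in the statement is not shown.

```python
import math
from collections import Counter


def estimate_cas_table(aligned_pairs):
    """Estimate CAS for every (stimulus_char, native_char) pair seen in the data.

    aligned_pairs: list of alignments, each a list of
    (stimulus_char, native_char) tuples.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for alignment in aligned_pairs:
        for stimulus_char, native_char in alignment:
            pair_counts[(stimulus_char, native_char)] += 1
            context_counts[stimulus_char] += 1
    # Assumed definition: CAS = -log2 P(native_char | stimulus_char).
    return {
        (s, n): -math.log2(count / context_counts[s])
        for (s, n), count in pair_counts.items()
    }


def word_adaptation_surprisal(alignment, cas_table):
    """WAS of one aligned word pair: the sum of the CAS values of its characters."""
    return sum(cas_table[pair] for pair in alignment)


# Toy alignments (made up): "я" maps to "е" once and to itself once.
toy_alignments = [
    [("а", "а"), ("я", "е")],
    [("я", "я"), ("а", "а")],
]
cas_table = estimate_cas_table(toy_alignments)
toy_was = word_adaptation_surprisal(toy_alignments[0], cas_table)
```

Because the probabilities are relative frequencies over the available pairs, the resulting values shift as the word list grows or shrinks, which is the dependence on the number of word pairs noted in the statement. Swapping the roles of stimulus and native character generally yields different values, which is what makes the measure asymmetric.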