Disentangling dialects: a neural approach to Indo-Aryan historical phonology and subgrouping

Cathcart, Chundra; Rama, Taraka

doi:10.18653/v1/2020.conll-1.50

Cited by 5 publications

(5 citation statements)

References 29 publications

“…As a demonstration of the usability of the dataset for computational historical linguistics, we replicate the reflex prediction task of Cathcart and Rama (2020). We train neural models on the task of reflex prediction in Indo-Aryan languages, i.e.…”

Section: Methods (mentioning)

confidence: 99%

Computational Historical Linguistics and Language Diversity in South Asia

Arora, Farris, Basu et al. 2022

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

South Asia is home to a plethora of languages, many of which severely lack access to new language technologies. This linguistic diversity also results in a research environment conducive to the study of comparative, contact, and historical linguistics, fields which necessitate the gathering of extensive data from many languages. We claim that data scatteredness (rather than scarcity) is the primary obstacle in the development of South Asian language technology, and suggest that the study of language history is uniquely aligned with surmounting this obstacle. We review recent developments in and at the intersection of South Asian NLP and historical-comparative linguistics, describing our and others' current efforts in this area. We also offer new strategies towards breaking the data barrier.

“…Other South Asian cognate databases. Cathcart (2019a,b, 2020) and Cathcart and Rama (2020) also previously made use of data from Turner (1962–1966) by scraping the version hosted online by Digital Dictionaries of South Asia.…”

Section: Introduction (mentioning)

confidence: 99%

Computational Historical Linguistics and Language Diversity in South Asia

Arora, Farris, Basu et al. 2022

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

“…Some of these sources have been used in previous work on South Asian historical linguistics, e.g. Cathcart and Rama (2020); Cathcart (2019b,a, 2020); this is the first attempt to consolidate them. Note some previous work in this direction: while the SARVA project (Southworth, 2005) did not reach fruition, a searchable database of Dravidian cognates was developed by Suresh Kolichala under its auspices.…”

Section: Jambu Etymological Database (mentioning)

confidence: 99%

Computational historical linguistics and language diversity in South Asia

Arora, Farris, Basu et al. 2022

Preprint

“…Sanskrit /kʂaːrə/ > Hindi /tʃʰaːr/ 'ashes' as well as /kʰaːr/ 'alkali' (Masica, 1993). The variability of these sound changes has recently been used to statistically model dialect components in IA languages (Cathcart, 2019a,b, 2020; Cathcart and Rama, 2020).…”

Section: Indo-Aryan Sound Changes (mentioning)

confidence: 99%

Bhāṣācitra: Visualising the dialect geography of South Asia

Arora, Farris, Gopalakrishnan et al. 2021

Proceedings of the 2nd International Workshop on Computational Approaches to Historical Language Change 2021

We present Bhāṣācitra, a dialect mapping system for South Asia built on a database of linguistic studies of languages of the region annotated for topic and location data. We analyse language coverage and look towards applications to typology by visualising example datasets. The application is not only meant to be useful for feature mapping, but also serves as a new kind of interactive bibliography for linguists of South Asian languages.
