Dialect classification is a classical problem in traditional dialectology. Over the last few decades, several quantitative approaches have been proposed as solutions to this problem, one of which uses "Levenshtein distance" to measure linguistic distances between dialects. In the present paper we shall introduce the Levenshtein algorithm as well as two methods with which the results of the measurements can be analyzed, viz. multidimensional scaling and clustering. Then we shall apply these methods to the Bulgarian language area and present a quantitative classification of Bulgarian dialects. Finally, we shall compare the classification obtained to the most widely accepted traditional Bulgarian dialect map, analyze the similarities and differences, and evaluate our method.
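As an illustration of the measure named above, the following is a minimal sketch of Levenshtein distance between two transcriptions, together with one possible way of aggregating it over a word list. It assumes unit costs for insertion, deletion, and substitution, and normalization by the length of the longer form; the abstract does not state the cost scheme or normalization actually used, and the function names `levenshtein` and `aggregate_distance` are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b with unit costs (an assumption)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def aggregate_distance(words_a, words_b):
    """Mean length-normalized distance over paired word forms of two varieties.

    Normalizing by the longer form is an assumption made for this sketch.
    """
    if not words_a or len(words_a) != len(words_b):
        raise ValueError("word lists must be non-empty and of equal length")
    total = 0.0
    for wa, wb in zip(words_a, words_b):
        longest = max(len(wa), len(wb))
        total += levenshtein(wa, wb) / longest if longest else 0.0
    return total / len(words_a)
```

Computing `aggregate_distance` for every pair of sites would yield a site-by-site distance matrix, which is the kind of input that multidimensional scaling and clustering operate on.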
The calculation of aggregate linguistic distances can compensate for some of the drawbacks inherent to the isogloss bundling method used in traditional dialectology to identify dialect areas. Synchronic aggregate analysis can also point out differences with respect to a diachronically based classification of dialects. In this study the Levenshtein algorithm is applied for the first time to obtain an aggregate analysis of the linguistic distances among 88 diatopic varieties of Croatian spoken along the Eastern Adriatic coast and in the Italian province of Molise. We also measured lexical differences among these varieties, which are traditionally grouped into Čakavian, Štokavian, and transitional Čakavian-Štokavian varieties. The lexical and pronunciational distances are subsequently projected onto multidimensional cartographic representations. Both kinds of analyses confirmed that linguistic discontinuity is characteristic of the whole region, and that discontinuities are more pronounced in the northern Adriatic area than in the south. We also show that the geographic lines are in many cases the most decisive factor contributing to linguistic cohesion, and that the internal heterogeneity within Čakavian is often greater than the differences between Čakavian and Štokavian varieties. This holds both for pronunciation and lexicon.

Introduction

One of the most popular methods applied in traditional geolinguistics (dialectology) is the method of isoglosses, in which areas characterized by different realizations of a single feature are separated by a line, an isogloss. Bundles of such lines were traditionally considered the most important criterion for the division of geolinguistic space into linguistic areas. Despite the tendency to rely on this method in traditional dialectology, even there it has long been recognized that isoglosses do not determine dialectal areas unambiguously because they rarely coincide completely.
The isogloss method needs additional assumptions to account for transitional zones and/or dialect continua, even though these are widely recognized to be as common as tightly knit and readily definable linguistic areas (Chambers & Trudgill, 1998:97). Brozović, who is aware of the problem, argues that in the case of Croatian, because of specific features of the dialectological make-up of this language, the use of the traditional isogloss method is nevertheless sometimes justified: "In our linguistic territory we often find the kind of clear-cut dialectal boundaries that older dialectologists could only dream of; these boundaries occur with intense, clear and dense bundles of isoglosses, whereas it has long been clear to dialectologists that such 'ideal' dialectal boundaries are not a common occurrence in language." (1970:9). It is our opinion, however, that the division of the Croatian language area into dialect groups is still problematic. This is because, although clear-cut dialectal boundaries might often be found in Croatia, they are by no means the rule as Brozović (1970...