This contribution deals with the use of quotations (repeated n-grams) in works of medieval Arabic literature. The analysis is based on a historical corpus of Arabic comprising 420 million words. Based on quotations repeated from work to work, a network is constructed and used to interpret various aspects of Arabic literature. Two short case studies are presented, concentrating on the centrality and relevance of individual works, and on the analysis of the time depth and resulting impact of a given work in various periods.

Quotations and Their Definition

The relevance of individual works in a given literature and the time depth of such relevance are of interest for many reasons. There are many methods that can reveal such relevance.

The current contribution is based on quotation extraction. Quotations, both covert and overt, from both written and oral sources, are among the constitutive features of medieval Arabic literature.

There are genres that depend heavily on establishing credible links among sources, especially oral ones, where a trustworthy chain of tradents is crucial for the claims that such chains accompany. Other links may point to the importance of a given work (or a part of it) and may uncover previously unseen relations within a given literature or a given genre/register, or reveal connections among genres/registers within a given literature. As such, the results are of interest across a wide range of research, from linguists and literary theorists to authors interested in the interactions of various subsets of a given literature.

Research on quotations, their extraction and detection is rich in NLP, but the algorithms used are based mainly on quotation-marker recognition, e.g. Pareti et al. (2013), Pouliquen et al. (2007) and Fernandes et al. (2011), or on metadata processing (e.g. Shi et al. 2010), to name just a few examples.
It can be said that most of these contributions focus on issues different from the one described here and take a different approach.

Our understanding of quotations in this project is limited to similar strings of words, i.e. the quotations are very close to borrowings or repetitions of verbatim or almost verbatim passages. Technically, a quotation can be viewed as an n-gram that is repeated in at least two works. These repeated n-grams create links that exhibit some hierarchy, e.g. along the chronological line. The only approach known to us that parallels ours is the one described in Kolak and Schilit (2008) for quotation mining within the Google Books corpus, with an algorithm searching for verbatim quotations only.

In a different context and without direct inspiration, we developed an algorithm that is tolerant of a certain degree of lexical and morphological variation and word order variability. The reason for this tolerance is both the type of the Arabic language (inflectional morphology and free word order) and the fact that quotations in medieval Arabic literature tend not to be very strict. Despite the fact that the matching is not so rigo...
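
The idea of a quotation as an n-gram repeated in at least two works, with links ordered chronologically, can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes whitespace tokenization, approximates tolerance to word order by comparing each window of n tokens as a sorted tuple, and does not handle lexical or morphological variation at all. The function names, the work identifiers, the years and the English sample texts are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def order_free_ngrams(tokens, n):
    """Yield each window of n tokens as a sorted tuple,
    so that word order inside the window is ignored."""
    for i in range(len(tokens) - n + 1):
        yield tuple(sorted(tokens[i:i + n]))


def quotation_links(works, n=3):
    """Return {(earlier_id, later_id): number of shared n-gram keys}.

    `works` maps a work id to (year, text); ids and years are
    hypothetical illustration data, not taken from the corpus.
    """
    index = defaultdict(set)  # n-gram key -> ids of works containing it
    for work_id, (_, text) in works.items():
        for key in order_free_ngrams(text.split(), n):
            index[key].add(work_id)

    links = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            # Direct the link from the chronologically earlier work to the later one.
            earlier, later = sorted((a, b), key=lambda w: works[w][0])
            links[(earlier, later)] += 1
    return dict(links)


works = {
    "A": (850, "the scholar said that knowledge is light"),
    "B": (1100, "he wrote that knowledge light is for the seeker"),
    "C": (1300, "a wholly unrelated sentence about trade routes"),
}
links = quotation_links(works, n=3)
print(links)
```

In this toy input, works A and B share the window "knowledge is light" despite the changed word order, so a single link is drawn from the earlier work A to the later work B, while C stays unconnected. A realistic system would additionally need normalization or fuzzy matching to cope with morphological variation.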
This paper presents the results of a project designed to functionally test the mutual intelligibility of spoken Maltese, Tunisian Arabic, and Benghazi Libyan Arabic. We compiled an audio-based intelligibility test consisting of three components: a word test where the respondents were asked to perform a semantic classification task with 11 semantic categories, a sentence test where the task was to provide a translation of a sentence into the respondent’s native language, and a text test where a short text was listened to twice and the respondents were asked to answer 8 multiple-choice questions. Data were collected from 24 respondents in Malta, Tunis, and Benghazi. It was found that there exists asymmetric mutual intelligibility between the two mainstream varieties of Maġribī Arabic and Maltese, with speakers of Tunisian and Libyan Arabic able to understand about 40 % of what is being said to them in Maltese, against about 30 % for speakers of Maltese exposed to either variety of Arabic. Additionally, it was found that Tunisian Arabic has the highest level of mutual intelligibility with either of the other two varieties. Combining the intelligibility scores with comparative linguistic data, we were able to sketch out the phonological variables involved in enabling and inhibiting mutual intelligibility for all three varieties of Arabic and set the stage for further research into the subject.