It is well known that Outer Circle English has undergone extensive contact‐induced lexical and grammatical restructuring. Is it possible to use common NLP tools developed for Inner Circle English to process Outer Circle English texts? Here, we report our experience of using the Stanford PoS tagger to tag the Singaporean component of the International Corpus of English (ICE‐SIN). We isolate two major contact‐related causes of tagging errors: (1) lexical and grammatical loans directly borrowed from the local languages; and (2) English‐origin words with new grammatical meanings acquired from the local languages. While the first type may be easy to overcome, the second type is intractable, creating an extra layer of morphosyntactic complexity. We achieved comparable accuracy rates in the more formal registers, and a lower but still decent 88% in the informal register of private conversations. A tagged ICE‐SIN allows us to investigate lexical and grammatical restructuring at unprecedented levels of detail.
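The per-register accuracy figures mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each text carries a register label plus aligned gold-standard and tagger-produced tag sequences; the function name `accuracy_by_register`, the register labels, and the toy tags are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def accuracy_by_register(samples):
    """Return token-level tagging accuracy for each register.

    `samples` is an iterable of (register, gold_tags, predicted_tags),
    where the two tag sequences are aligned token by token.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for register, gold, predicted in samples:
        if len(gold) != len(predicted):
            raise ValueError("gold and predicted tag sequences must align")
        for gold_tag, predicted_tag in zip(gold, predicted):
            total[register] += 1
            correct[register] += int(gold_tag == predicted_tag)
    return {register: correct[register] / total[register] for register in total}


# Toy data (hypothetical): the last token of the conversational sample
# stands in for a borrowed discourse particle that the tagger mislabels.
samples = [
    ("private_conversation", ["PRP", "MD", "VB", "UH"], ["PRP", "MD", "VB", "NN"]),
    ("academic_writing", ["DT", "NN", "VBZ", "JJ"], ["DT", "NN", "VBZ", "JJ"]),
]
scores = accuracy_by_register(samples)
print(scores)
```

Grouping by register before averaging keeps a large formal subcorpus from masking a lower score in the conversational data.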
Singapore English is a new variety of English that has developed unique grammatical characteristics due to contact with the heritage languages of Singapore, especially Chinese. In this paper, we document the morphosyntax of negation in Singapore English, using data culled from available databases, including the Singaporean component of the International Corpus of English. There is little doubt that the negation system is inherited from English, but there is strong Chinese influence in the interaction between negation and aspect, and between negation and quantification. Other features of negation will also be described, including the novel use of no and no need, also due to Chinese influence.
It is well-documented that patients with semantic dementia and Alzheimer’s disease present with difficulty in lexical retrieval and reversal of the concreteness effect in nouns and verbs. Little is known about the lexical phenomena before the onset of symptoms. We anticipate that there are linguistic signs in the speech of people who suffer from mild cognitive impairment (MCI), the prodromal stage of dementia. Here, we report the results of a novel corpus-linguistic approach to the early detection of cognitive impairment. We recorded 40 hours of natural, unconstrained speech of 188 English-speaking Singaporeans; 90 are diagnosed with MCI (51 amnestic, 39 nonamnestic), and 98 are cognitively healthy. The recordings yield 327,470 words, which are tagged for parts of speech. We calculate the per-minute speech rates and concreteness scores of nouns and verbs, and of all tagged words, in our dataset. Our analysis shows that the two measures of nouns and verbs identify different subtypes of MCI. Compared with healthy controls, subjects with amnestic MCI produce fewer but more abstract nouns, whereas subjects with nonamnestic MCI produce fewer but more concrete verbs. Cognitive impairment is manifested in ordinary language before the presentation of clinical symptoms, and can be detected through non-invasive corpus-based analysis of natural speech.
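The two measures described in the abstract above, per-minute speech rate and concreteness score for a part-of-speech class, can be sketched as follows. This is only an illustration under stated assumptions: the function name `pos_measures`, the Penn-style tag prefixes, and the small concreteness lexicon with made-up ratings are hypothetical, and the abstract does not say which concreteness norms or tagset were used.

```python
def pos_measures(tagged_tokens, duration_minutes, concreteness, tag_prefix):
    """Return (words per minute, mean concreteness) for one PoS class.

    `tagged_tokens` is a list of (word, tag) pairs, `concreteness` maps a
    lowercased word to a rating, and `tag_prefix` selects the class
    (for example "NN" for nouns, "VB" for verbs).
    """
    words = [word.lower() for word, tag in tagged_tokens if tag.startswith(tag_prefix)]
    rate = len(words) / duration_minutes
    # Only words present in the lexicon contribute to the mean rating.
    rated = [concreteness[word] for word in words if word in concreteness]
    mean_rating = sum(rated) / len(rated) if rated else None
    return rate, mean_rating


# Toy input (hypothetical): half a minute of speech and invented ratings.
tagged = [("the", "DT"), ("dog", "NN"), ("chased", "VBD"), ("an", "DT"), ("idea", "NN")]
lexicon = {"dog": 5.0, "idea": 2.0, "chased": 4.0}

noun_rate, noun_concreteness = pos_measures(tagged, 0.5, lexicon, "NN")
verb_rate, verb_concreteness = pos_measures(tagged, 0.5, lexicon, "VB")
print(noun_rate, noun_concreteness, verb_rate, verb_concreteness)
```

Under this sketch, a speaker who produces fewer but more abstract nouns would show a lower noun rate together with a lower mean noun rating.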