The results reported in this paper indicate that native speakers of Mandarin Chinese rate the perceptual similarities among the lexical tones of Mandarin differently than do native speakers of American English. Mandarin listeners were sensitive to tone contour while English listeners attended to pitch levels. Chinese listeners also rated tones that are neutralized by phonological tone sandhi rules in Mandarin as more similar to each other than did English speakers – indicating a role of phonology in determining perceptual salience. In two further experiments, we found that some of these differences were eliminated when the listening task focused listeners’ attention on the auditory properties of the stimuli, but, interestingly, a degree of language specificity remained even in the most purely psychophysical listening tasks with speech stimuli.
Background / introduction: SAR image automatic target recognition technology (SAR-ATR) is one of the research hotspots in the field of image cognitive learning. Inspired by the human cognitive process, experts have designed convolutional neural network (CNN) based methods and successfully applied them to SAR-ATR. However, the performance of CNNs deteriorates significantly when the labelled samples are insufficient. Methods: To effectively utilize the unlabelled samples, a semi-supervised CNN method is proposed in this paper. First, a CNN is used to extract the features of the samples, and subsequently the class probabilities of the unlabelled samples are computed using the softmax function. To improve the effectiveness of the unlabelled samples, we remove possible noise by thresholding the class probabilities. Afterwards, based on the remaining class probabilities, the information contained in the unlabelled samples is integrated with the scatter matrices of the standard linear discriminant analysis (LDA) method. The loss function of the CNN consists of a supervised component and an unsupervised component, where the supervised component is created using the cross-entropy function and the unsupervised component is created using the scatter matrices. The class probabilities are utilized to control the impact of the unlabelled samples in the training process, and the reliability of the unlabelled samples is further improved. Results: We choose ten types of targets from the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset. The experimental results show that the recognition accuracy of our method is significantly higher than that of the supervised CNN method. Conclusions: The results indicate that our method can effectively improve SAR-ATR accuracy despite the shortage of labelled samples.
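The loss described in this abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so several pieces here are assumptions: the function names, the threshold `tau`, the weighting factor `lam`, the use of the trace ratio of within-class to between-class scatter as the unsupervised term, and the choice to build the scatter matrices from unlabelled features only. It is an illustration of the general idea, not the authors' implementation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def threshold_probs(probs, tau=0.5):
    # Assumed noise filter: discard class probabilities below tau.
    return np.where(probs >= tau, probs, 0.0)

def weighted_scatter(features, weights):
    # LDA-style scatter matrices in which each sample contributes to
    # class k in proportion to weights[:, k] (soft class membership).
    n, d = features.shape
    s_w = np.zeros((d, d))
    s_b = np.zeros((d, d))
    mass = weights.sum(axis=0)
    total = mass.sum()
    if total == 0:  # every unlabelled sample was filtered out
        return s_w, s_b
    global_mean = weights.sum(axis=1) @ features / total
    for k in range(weights.shape[1]):
        if mass[k] == 0:
            continue
        mean_k = weights[:, k] @ features / mass[k]
        centred = features - mean_k
        s_w += (centred * weights[:, k:k + 1]).T @ centred
        diff = (mean_k - global_mean)[:, None]
        s_b += mass[k] * (diff @ diff.T)
    return s_w, s_b

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true labels.
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + 1e-12))

def semi_supervised_loss(lab_logits, labels, unl_logits, unl_features,
                         tau=0.5, lam=0.1):
    # Supervised term: cross-entropy on labelled samples.
    sup = cross_entropy(softmax(lab_logits), labels)
    # Unsupervised term (assumed form): trace(S_w) / trace(S_b),
    # weighted by the thresholded class probabilities.
    w = threshold_probs(softmax(unl_logits), tau)
    s_w, s_b = weighted_scatter(unl_features, w)
    unsup = np.trace(s_w) / (np.trace(s_b) + 1e-8)
    return sup + lam * unsup
```

In this sketch a confidently classified unlabelled sample pulls its class mean more strongly than an uncertain one, which is one way to read the statement that the class probabilities control the impact of the unlabelled samples. In an actual training setup the same computation would be written in an automatic differentiation framework so that the loss can be backpropagated through the CNN.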
This chapter describes the initial stages of development of a Pan-Mandarin ToBI system. It reviews the salient prosodic characteristics of Mandarin, such as lexical tones, tone sandhi, tonal neutralisation, stress patterns, pitch range effects, and prosodic groupings above the syllable level. Particular attention is paid to the range of variability within a common structural core, in addition to points of reference to other varieties of Chinese and to other languages. It then proposes a codification of conventions for marking prosodic structure and an inventory of tones in two standard varieties (i.e. Putonghua of Mainland China and Guoyu of Taiwan) and one regional variety of the language (i.e. Rugaohua, a Jianghuai Mandarin variety). Also built into the system is the capability to accommodate interactions, such as code-switching events, between different varieties of Mandarin and perhaps between Mandarin and other varieties of Chinese (and other languages) in different social contexts.
The present study investigated tone perception by speakers of Taiwanese Southern Min and of American English using an AX discrimination task. Two Taiwanese Southern Min tone continua were constructed from natural speech stimuli. One continuum ranged from a high level tone (T55) to a mid level tone (T33), and the other from a high level tone (T55) to a high falling tone (T51). The results showed that perception by Taiwanese listeners was quasi-categorical for the contour-level tone continuum but mostly continuous for the level-level tone continuum. This suggests that the findings by Abramson (J Acoust Soc Am 61:S66, 1979a; In: Lindblom B, Öhman S (eds) Frontiers of speech communication, 1979b), Wang (Ann N Y Acad Sci 280:61-72, 1976), and Chan et al. (J Acoust Soc Am 58:S119, 1975) should be seen as complementary rather than contradictory. Differences in tone perception between Taiwanese and English listeners were also found. Taiwanese listeners exhibited a region of higher discriminability on the T55-T51 continuum, while no discrimination peak was observed in the English listeners' data. In addition, Taiwanese listeners were more accurate than English listeners in tone discrimination. These results indicate a qualitative difference in lexical tone perception between tone and nontone language listeners: tone language listeners appear to perceive tones as phonemic categories, utilizing cues such as pitch contour, while nontone language listeners rely more on psychoacoustic factors such as pitch height (cf.