To rescue and preserve an endangered language, this paper studied an end-to-end speech recognition model based on sample transfer learning for the low-resource Tujia language. Working at the level of International Phonetic Alphabet (IPA) labels, a Chinese corpus was used to extend the Tujia corpus, which mitigates the shortage of Tujia data; a cross-language corpus and an IPA dictionary unified across Chinese and Tujia were constructed. A convolutional neural network (CNN) and a bi-directional long short-term memory (BiLSTM) network were used to extract cross-language acoustic features and to train shared hidden-layer weights for the Tujia and Chinese phonetic corpora. Automatic speech recognition for Tujia was then realized with an end-to-end method consisting of symmetric encoding and decoding, and transfer learning was used to build the cross-language end-to-end Tujia recognition system. The experimental results showed that the recognition error rate of the proposed model is 46.19%, which is 2.11% lower than that of the model trained only on Tujia data. The approach is therefore feasible and effective.
Objective. To explore the value of an artificial intelligence (AI) film reading system based on deep learning in the diagnosis of non-small-cell lung cancer (NSCLC) and its significance for monitoring curative effect. Methods. We retrospectively selected 104 suspected NSCLC cases from the self-built chest CT pulmonary nodule database in our hospital, all of which were confirmed by pathological examination. The lung CT images of the selected patients were imported into the AI reading system for pulmonary nodules, the software automatically identified and recorded the nodules, and the results were compared with those of the original image reports. The nodules detected by the AI software and by the film readers were evaluated by two chest experts, who recorded their size and characteristics. Sensitivity and false positive rate were calculated to compare the nodule detection performance of the AI software and the physicians and to determine whether the two groups differed significantly. Results. The sensitivity, specificity, accuracy, positive predictive rate, and false positive rate of NSCLC diagnosed by radiologists were 72.94% (62/85), 92.06% (58/63), 81.08% ((62 + 58)/148), 92.53% (62/67), and 7.93% (5/63), respectively. The sensitivity, specificity, accuracy, positive predictive rate, and false positive rate of the AI film reading system in the diagnosis of NSCLC were 94.12% (80/85), 77.77% (49/63), 87.16% ((80 + 49)/148), 85.11% (80/94), and 22.22% (14/63), respectively. Compared with radiologists, both the sensitivity and the false positive rate of the AI film reading system in the diagnosis of NSCLC were higher (P < 0.05). The sensitivity, specificity, accuracy, positive predictive rate, and negative predictive rate of the AI film reading system in evaluating treatment efficacy in patients with NSCLC were 87.50% (63/72), 69.23% (9/13), 84.70% ((63 + 9)/85), 94.02% (63/67), and 50% (9/18), respectively. Conclusion. The AI film reading system based on deep learning has higher sensitivity for the diagnosis of NSCLC than radiologists and can be used as an auxiliary detection tool for doctors screening for NSCLC, but its false positive rate is relatively high, so its positive findings should be verified carefully. The AI film reading system also has some guiding significance for the diagnosis and treatment monitoring of NSCLC.
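The diagnostic figures in the Results can be recomputed from the counts given in parentheses. The following is a minimal sketch, not the authors' analysis code: the names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the false-negative counts (23 and 5) are inferred here as 85 minus the reported true positives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary diagnostic test (hypothetical helper type)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard ratios used in the abstract, as fractions."""
    positives = c.tp + c.fn  # cases that are truly positive
    negatives = c.tn + c.fp  # cases that are truly negative
    return {
        "sensitivity": c.tp / positives,
        "specificity": c.tn / negatives,
        "accuracy": (c.tp + c.tn) / (positives + negatives),
        "positive_predictive_rate": c.tp / (c.tp + c.fp),
        "false_positive_rate": c.fp / negatives,
    }


# Counts taken from the fractions reported in the Results;
# fn is inferred as 85 - tp (an assumption of this sketch).
radiologists = ConfusionCounts(tp=62, fn=23, tn=58, fp=5)
ai_system = ConfusionCounts(tp=80, fn=5, tn=49, fp=14)

if __name__ == "__main__":
    for name, counts in [("radiologists", radiologists), ("AI system", ai_system)]:
        print(name)
        for metric, value in diagnostic_metrics(counts).items():
            print(f"  {metric}: {value * 100:.2f}%")
```

Values recomputed this way may differ from the reported percentages in the last decimal place, since some of the reported figures appear to be truncated rather than rounded.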
The polarization of the world's languages is becoming increasingly obvious. Many languages, mainly endangered ones, are low-resource because little data is available for them, and both language conservation and cultural heritage face serious challenges. Speech recognition for low-resource scenarios has therefore become a hot topic in the speech field. With its complex network structures and large numbers of model parameters, deep learning has become a powerful tool for speech recognition and is of broad and far-reaching significance for the study of low-resource speech recognition. Focusing on the low-resource setting, this paper reviews the history and research status of two kinds of acoustic models: deep neural network acoustic models and end-to-end structures. We further elaborate on several key techniques for improving performance from two aspects, data and model training. Two projects for low-resource languages are introduced, and possible future developments are pointed out. This work provides a reference for computer speech and language processing.