Enhanced Running Spectrum Analysis for Robust Speech Recognition Under Adverse Conditions: A Case Study on Japanese Speech

Mufungulwa, George; Tsutsui, Hiroshi; Miyanaga, Yoshikazu; Abe, Shunsuke

doi:10.37936/ecti-cit.2017111.81945

Cited by 4 publications

(1 citation statement)

References 15 publications

“…The literature introduces an improved model of a multi-scale God convolutional network and uses it to fine-tune events [9][10]. The stacked traditional convolutional neural network has the problem of losing representation at a lower level.…”

Section: Related Work (mentioning)

confidence: 99%

Sound signal analysis in Japanese speech recognition based on deep learning algorithm

Xiaoxing

2023

Preprint

Sound is an important carrier of information: it can be collected quickly and is not limited by viewing angle or lighting, so it is often used to assist in understanding the environment and creating information. Voice signal recognition is a typical speech recognition application. This article focuses on voice signal recognition with various deep learning models. Deep neural networks of different structures and types can extract information and representations relevant to recognizing sound signal samples, which further improves the detection accuracy of a sound signal recognition system. Based on this, the paper proposes an enhanced multi-scale convolutional neural network model and uses it to recognize sound signals. A CCCP layer reduces the dimensionality of the underlying feature maps so that the units captured in the network retain internal features at each layer, preserving feature information as far as possible and forming a multi-scale convolutional model. Finally, the article discusses Japanese speech recognition on this basis. It first models graphemes, that is, all Japanese kana and common Chinese characters, using a total of 2795 units. There is a big gap between the experiment and the BiLSTM-HMM system. In addition, when Japanese speech is known, it is incorporated into the end-to-end recognition system to improve the performance of the Japanese speech recognition system. Based on these deep learning and sound signal analysis experiments and principles, the final result is better than that of Japanese speech recognition systems based on the hidden Markov model and the long short-term memory network, thus promoting its development.

Sound signal analysis in Japanese speech recognition based on deep learning algorithm

Xiaoxing

2023

Int J Syst Assur Eng Manag

An Evaluation of Keyword Detection Using ACF of Pitch for Robust Speech Recognition

Tian, Jiang, Tsutsui, et al.

2018

2018 18th International Symposium on Communications and Information Technologies (ISCIT)
