Automatic speech recognition (ASR), once considered a concept of science fiction and long hampered by a number of performance-degrading factors, is now an important part of information and communication technology. Improvements to fundamental approaches and the development of new ones have led ASR systems to advance from responding to a fixed set of sounds to sophisticated systems that respond to fluently spoken natural language. Using artificial neural networks (ANNs), mathematical models of the low-level circuits in the human brain, to improve speech-recognition performance through a model known as the ANN-Hidden Markov Model (ANN-HMM) has shown promise for large-vocabulary speech recognition systems. Achieving higher recognition accuracy and a lower word error rate, developing speech corpora suited to the nature of the language, and addressing sources of variability through approaches such as missing data techniques and convolutive non-negative matrix factorization are the major considerations in developing an efficient ASR system. In this paper, an effort has been made to highlight the progress made so far for ASRs of different languages and the technological perspective of automatic speech recognition in countries like China, Russian,
Punjabi is a tonal language belonging to the Indo-Aryan language family and has a number of speakers all around the world. Punjabi has gained acceptance in media and communication and therefore deserves a place in the growing field of automatic speech recognition, which has already been explored successfully for a number of other Indian and foreign languages. Some work has been done on isolated-word and connected-word speech recognition for Punjabi, with acoustic template matching and vector quantization as the supporting techniques. Continuous speech recognition is one area where no work has been done so far for Punjabi. In this paper, an effort has been made to build an automatic speech recognizer for continuous speech sentences using a triphone-based acoustic modeling approach on the HTK 3.4.1 speech engine. Overall recognition accuracy was found to be 82.18% at the sentence level and 94.32% at the word level.
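Word-level and sentence-level figures such as those reported above are commonly derived from an edit-distance alignment between reference and recognized transcripts. The following is a minimal sketch under that assumption, not the scoring procedure used by the authors: the function names `edit_distance` and `score` and the placeholder tokens are hypothetical, and word accuracy is taken here as one minus the word error rate.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists
    (substitutions, deletions, and insertions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def score(pairs):
    """Score (reference, hypothesis) sentence pairs.

    Returns word error rate, word accuracy (1 - WER), and the
    fraction of sentences recognized with no errors at all.
    """
    total_words = 0
    total_errors = 0
    correct_sentences = 0
    for ref, hyp in pairs:
        ref_tokens, hyp_tokens = ref.split(), hyp.split()
        total_words += len(ref_tokens)
        total_errors += edit_distance(ref_tokens, hyp_tokens)
        correct_sentences += int(ref_tokens == hyp_tokens)
    wer = total_errors / total_words
    return {
        "wer": wer,
        "word_accuracy": 1 - wer,
        "sentence_accuracy": correct_sentences / len(pairs),
    }


# Placeholder tokens stand in for real transcripts.
example = [
    ("w1 w2 w3 w4", "w1 w2 w3 w4"),  # fully correct sentence
    ("w1 w2 w3 w4", "w1 wX w3"),     # one substitution, one deletion
]
result = score(example)
```

Because a single wrong word makes the whole sentence count as an error, sentence-level accuracy is normally lower than word-level accuracy, which matches the direction of the two figures quoted above.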