2020
DOI: 10.1007/s10772-020-09717-8

DNN based continuous speech recognition system of Punjabi language on Kaldi toolkit

Cited by 18 publications
(11 citation statements)
References 20 publications
“…Increasing the number of hidden layers, improve the performance of the network, however, it impose serious issue such as computation cost, network complexity and model overfitting. It has been shown that the deep algorithms employed successfully in a number of fields such as image recognition [55]-[57], speech recognition [58], [59], natural language processing [60], [61] and bioinformatics [62]-[64]. Additionally, it has been presented by several researchers that the DNN demonstrated superior performance over the traditional learning approaches employed for a various complex problems [53], [65].…”
Section: Heterogeneous Feature Vector (mentioning)
confidence: 99%
“…The audio module has no built in analysis, nor classification capabilities, as this is deferred to the text processing module. Kaldi [15] is ideal for this transcription task, as it can be integrated at the operating system level, making these audio signals fully available to our solution. Capturing audio signals is independent from screen capturing, which is why the audio and image analysis tasks are naturally divided.…”
Section: A. The Audio Module (mentioning)
confidence: 99%
“…Kaldi is appropriate for the child protection context mainly because it is flexible in controlling all parts of the speech-to-text conversion and could easily adapt to different noisy environments by integrating different acoustic modelling scripts [15] at the operating system level. [Figure 3: The design of the transcription system.] This approach is also flexible, meaning that it could be used locally, in an edge-computing or cloud-computing architecture.…”
Section: A. The Audio Module (mentioning)
confidence: 99%
“…In addition, several techniques were proposed by the researchers to improve the acoustic variabilities. Different front-end feature extraction techniques perceptual linear prediction (PLP), spectrum-based feature extraction, and Mel-Frequency cepstral coefficients (MFCC) have been used to extract the acoustic features [1, 9, 12-14]. Researchers have also made minor changes in the feature extraction process implemented in the front-end, these pitch features are also used for improving the speech recognition rate [10, 13, 15-17].…”
Section: Related Work (mentioning)
confidence: 99%