2020
DOI: 10.1016/j.specom.2020.02.006

Multilingual and multimode phone recognition system for Indian languages

Abstract: The aim of this paper is to develop a flexible framework capable of automatically recognizing the phonetic units present in a speech utterance of any language spoken in any mode. In this study, we considered two modes of speech, conversation and read, in four Indian languages: Telugu, Kannada, Odia, and Bengali. The proposed approach consists of two stages: (1) automatic speech mode classification (SMC) and (2) automatic phonetic recognition using a mode-specific multilingual phone recognition system (M…

Cited by 5 publications (6 citation statements)
References 29 publications
“…We, therefore, selected a basic feedforward NN model to study vocal tract dynamics. A similar study on speech classification (Tripathi et al, 2020) also used such a basic NN model. The following subsections explain the elements involved in the classification model.…”
Section: Neural Network-based Classification Model (mentioning)
confidence: 99%
“…Thus, lower parameterized NN poorly performs in word classification. Tripathi et al (2020) classified continuous speech into conversational and read modes and later developed a mode-specific phone recognition system. They reported 0.83 speech mode classification accuracy with vocal tract features.…”
Section: Summary and Comparison (mentioning)
confidence: 99%
“…Among various existing cepstral features, MFCCs can be regarded as the defacto standard feature set in the area of voice pathology detection. MFCCs have also been widely used as default reference features in many areas outside pathology detection (like speaker recognition [38], speech spoof detection [39], speech mode classification [40], etc.). Moreover, MFCC features are widely included in larger generic feature sets (such as the openSMILE feature set [41], GeMAPS feature set [42], and ComParE feature set [43]) to capture vocal tract information from speech and voice signals.…”
Section: Introduction (mentioning)
confidence: 99%