2022
DOI: 10.3390/app12189181

Spoken Language Identification System Using Convolutional Recurrent Neural Network

Abstract: Following recent advancements in deep learning and artificial intelligence, spoken language identification applications are playing an increasingly significant role in our day-to-day lives, especially in the domain of multi-lingual speech recognition. In this article, we propose a spoken language identification system that depends on the sequence of feature vectors. The proposed system uses a hybrid Convolutional Recurrent Neural Network (CRNN), which combines a Convolutional Neural Network (CNN) with a Recurr…

Cited by 24 publications (32 citation statements)
References 38 publications
“…Most studies on speech recognition use feature types such as MFCC, GFCC, spectrogram, spectral characteristics, PLP, and LPC [11][12][13][14]. However, the most recent methods include the use of joint factor analysis (JFA) and i-vector-based methods [15][16][17].…”
Section: Related Work (mentioning)
confidence: 99%
“…Gammatone Filter Bank Cepstral Coefficients (GTCC) are obtained by replacing the triangular filter bank in MFCC with a set of Gammatone filters emphasizing different frequency bands [20,21]. Compared to MFCC, GTCC is reported to be more robust to noise [14]. Prosodic and spectro-temporal features comprise pitch, energy, duration, rhythm, and temporal features.…”
Section: Related Work (mentioning)
confidence: 99%
“…There is no standard technique that can serve as the gold standard for discriminating between different languages. Additionally, the study of the possible similarities and dissimilarities between Arabic and other languages is urgently needed to improve spoken language identification [14].…”
Section: Literature Review (mentioning)
confidence: 99%
“…The above equation can be described as scalar product among the log spectral energy vector and a vector of weighting factors WF_l as in Eq. (7).…”
Section: Feature Extraction Techniques (mentioning)
confidence: 99%
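The equation that this statement refers to is not reproduced in the excerpt, so the following is only a minimal sketch of the idea it describes: a cepstral coefficient computed as a scalar product of a log spectral energy vector with a weighting vector. It assumes the weighting factors are the DCT-II cosine terms commonly used for MFCC-style features. The function names `dct_weights` and `cepstral_coefficient` and the sample energies are hypothetical and do not come from the citing paper.

```python
import math


def dct_weights(l, num_bands):
    """Weighting vector for cepstral index l.

    Assumption: DCT-II cosine weights, as commonly used for MFCC-style
    features. The citing paper's exact weighting factors are not shown.
    """
    return [math.cos(l * (m + 0.5) * math.pi / num_bands)
            for m in range(num_bands)]


def cepstral_coefficient(log_energies, l):
    """Scalar product of the log spectral energy vector and the weights."""
    weights = dct_weights(l, len(log_energies))
    return sum(e * w for e, w in zip(log_energies, weights))


# Hypothetical filter-bank energies for one frame (four bands).
band_energies = (1.0, 2.0, 4.0, 8.0)
log_energies = [math.log(x) for x in band_energies]

# For l = 0 every weight is 1, so the coefficient is the sum of the logs.
c0 = cepstral_coefficient(log_energies, 0)

# A flat log spectrum has no shape, so its l = 1 coefficient is about zero.
flat_c1 = cepstral_coefficient([1.0, 1.0, 1.0, 1.0], 1)
```

Under this assumption, each coefficient is one dot product per frame, and stacking the coefficients for successive values of l gives the feature vector for that frame.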
“…Ladakhi etc. In 2022, Alashban et al (7) proposed a spoken language identification system that depends on the sequence of feature vectors. The proposed system used a hybrid Convolutional Recurrent Neural Network (CRNN) that combines a Convolutional Neural Network (CNN) with a Recurrent Neural Network (RNN) network, for spoken language identification on seven languages, including Arabic, chosen from subsets of the Mozilla Common Voice (MCV) corpus.…”
Section: Introduction (mentioning)
confidence: 99%