2008 International Conference on Audio, Language and Image Processing 2008
DOI: 10.1109/icalip.2008.4590209

Syllable-based automatic Arabic speech recognition in noisy environment

Abstract: In this paper, syllables are proposed to be used as acoustic units to improve the performance of automatic speech recognition (ASR) systems of Arabic spoken proverbs in noisy environments. To test our proposed approach, a speaker-independent HMM-based speech recognition system was designed using Hidden Markov Model Toolkit (HTK). A series of experiments on noisy speech has been carried out using an Arabic database that consists of fifty-nine Egyptian speakers. The obtained results show that the recognition rat…

Citations: cited by 12 publications (11 citation statements)
References: 18 publications
“…Experiments in [3] demonstrated that syllables are less sensitive to degradations of the speech signal due to interfering noise or distortions. In a previous work, we showed that the syllable-based ASR rate outperformed the ASR rate obtained using monophones, triphones and words by 2.68%, 1.19% & 1.79% respectively [4]. In this paper, we evaluate the use of syllables for the robustness of the automatic recognition of dysarthric speech.…”
Section: Introduction
confidence: 92%
“…On the other hand, Automatic Speech Recognition (ASR) is a technology that allows a computer to identify the words that a person speaks into a microphone or telephone with a wide range of applications. Based on that, investigations were made, in reference [13], between monophone, triphone, syllable, and word-based calculations for recognizing Egyptian Arabic digits. Actually, 39 MFCCs were extracted as features for each recorded voice in the database.…”
Section: Related Work
confidence: 99%
“…In particular, a DCT is a Fourier-related transform which is basically similar to the discrete Fourier transform (DFT), but using only real numbers [66,71]. As being formulated in (13), DCT is the preferred technique to use, when planning to get back to the time domain, as of having a highly uncorrelated feature. Actually, logarithm condenses a dynamic range of values whereas the minor differences at high amplitudes do not affect the humans much when compared to that at low amplitudes.…”
Section: Step 4: Mel Filter Bank
confidence: 99%
“…Many HMM-based ASR systems for continuous Arabic speech have reached various levels of recognition accuracy and encouraging performances which have been achieved [14][15][16][17][18]. The accuracy of recognition is usually measured by the correct percentage of recognized phonemes.…”
Section: Research Efforts Summary
confidence: 99%