Edward Gatt scite author profile

Over the past decades, extensive research has been carried out on various possible implementations of automatic speech recognition (ASR) systems. The most renowned algorithms in the field of ASR are the mel-frequency cepstral coefficients and the hidden Markov models. However, there are also other methods, such as wavelet-based transforms, artificial neural networks and support vector machines, which are becoming more popular. This review article presents a comparative study on different approaches that were proposed for the task of ASR, and which are widely used nowadays.

† training time increases linearly with increase in vocabulary size [42]
† quantisation error in the discrete representation of speech signals [42]
† temporal information is ignored [42]

PCA
Advantages:
† reduction in the feature vector's size, while retaining much of the significant information [131]
† robust [59, 60]
Disadvantages:
† computationally expensive for high-dimensional data [8]

LDA
Advantages:
† maximises the distance between classes, but minimises the within class distance [132]
† robust [133]
Disadvantages:
† sample distribution is assumed a priori to be Gaussian [63]
† class samples are assumed to have equal variance [63]

Classification technique: Advantages and Disadvantages

HMM
Advantages:
† able to model time distribution of speech signals [103]
† simple to adapt [68]
† capable to model a sequence of discrete or continuous symbols [13]
† inputs can be of variable length [40]
Disadvantages:
† based on the assumption that the probability of being in a particular state is dependent only on its preceding state, ignoring any long-term dependencies [82]
† emission probabilities are arbitrarily chosen; hence, these might not even represent properly the output probabilities of the corresponding state [82]

ANN (in general)
Advantages:
† good classifiers [16, 45]
† highly adequate for pattern recognition applications [16, 45]
† self-organising [16, 45]
† self-learning [16, 45]
† self-adaptive in new environments [16, 45]
† robust [7]
Disadvantages:
† based on ERM; hence, prone to over training and local minima problems [45, 103]

MLP
Advantages:
† good discriminating ability [2]
Disadvantages:
† unable to model time distribution of speech signals [2]
† inputs have to be of fixed length [2]
† able to deal with small vocabularies only [2]

SOM
Advantages:
† no a priori information is required for training a SOM [134]
† can easily adapt if a new sample is presented to it [134]
† capable of parallel computation [134]
Disadvantages:
† SOM algorithm is not well defined mathematically; hence, values for the network parameters need to be found by trial-and-error [134]
† ordered mapping obtained after the training phase may be lost when applied in real environments due to frequent adaptations [134]

RBF
Advantages:
† simple to implement [135]
† good discriminating ability [135]
† robust [135]
† online learning ability [135]
† shift invariant in time [91]

RNN
Advantages:
† able to model time distribution of speech signals thanks to the feedback connections [95, 103]
Disadvantages:
† complex training algorithm [94]
† training algorithm is highly sensitive to any changes [94]

FNN
Advantages:
† does not need large amount of samples during the learning process [99]
† ...

Design of a 1.2 V Low Phase Noise 1.6 GHz CMOS Buffered Quadrature Output VCO with Automatic Amplitude Control

Casha, Grech, Micallef, et al. 2006

Hardware-based support vector machine for phoneme classification

Cutajar, Gatt, Grech, et al. 2013

Performance analysis of an RF MEMS TPoS resonator using FE modelling

Farrugia, Grech, Casha, et al. 2015

Air damping of high performance resonating micro-mirrors with angular vertical comb-drive actuators

et al. 2019

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA


Comparative study of automatic speech recognition techniques
