Recently, deep learning classifiers have proven even more robust at pattern recognition and classification than texture analysis techniques. With the broad availability of relatively inexpensive Graphics Processing Units (GPUs), many researchers have begun applying deep learning techniques to visual representations of acoustic traces. Preselected or handcrafted descriptors, such as LBP, are unnecessary for deep learners because they learn salient features during the training phase. Deep learners, moreover, are uniquely suited to handling visual representations of audio because many of the best-known deep classifiers, such as Convolutional Neural Networks (CNNs), take matrices as their input. Humphrey and Bello [17, 18] were among the first to apply CNNs to audio images for music classification and, in doing so, redefined the state of the art in automatic chord detection and recognition. In the same year, Nakashika et al. [19] converted spectrograms to Gray-Level Co-occurrence Matrix (GLCM) maps to train CNNs to perform music genre classification on the GTZAN dataset [20]. Later, Costa et al. [21] fused a CNN with the traditional pattern recognition framework of training SVMs on LBP features to classify the LMD dataset. These works exceeded traditional classification results on these genre datasets.

Up to this point, most work in audio classification has applied the latest advances in machine learning to the problem of sound classification and recognition without modifying the classification process to make it specifically suited to sound recognition. An early exception to the generic approach is found in the work of Sigtia and Dixon [22], who adjusted CNN parameters and structures to reduce the time required to train on a set of audio images. Time reduction was accomplished by replacing sigmoid units with rectified linear units (ReLUs).
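Because the only audio-specific step in this pipeline is rendering the signal as a matrix, the core idea is easy to sketch. The following is a minimal illustration, not any of the cited architectures: it assumes a 128x128 one-channel log-mel spectrogram and ten output classes (e.g., the GTZAN genres), and all layer sizes are arbitrary choices made for the example.

```python
# Minimal sketch: a spectrogram is treated as a one-channel image and
# classified end to end by a small CNN. All hyperparameters here
# (input size, channel counts, class count) are illustrative assumptions.
import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    def __init__(self, n_classes: int = 10):  # e.g., 10 GTZAN genres
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1 channel: the spectrogram "image"
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 128x128 -> 64x64
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 64x64 -> 32x32
        )
        self.classifier = nn.Linear(32 * 32 * 32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)                 # (B, 32, 32, 32)
        return self.classifier(x.flatten(1)) # genre logits

# Usage: a batch of 4 single-channel 128x128 spectrogram matrices.
batch = torch.randn(4, 1, 128, 128)
logits = SpectrogramCNN()(batch)             # shape (4, 10)
```

Note that no handcrafted descriptor such as LBP or GLCM is computed anywhere in this sketch; the convolutional filters are expected to learn the salient time-frequency patterns during training, which is precisely the property that distinguishes deep learners from the texture analysis approaches discussed above.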