2019
DOI: 10.1016/j.procs.2019.12.233

Transfer Learning with AudioSet to Voice Pathologies Identification in Continuous Speech

Citations: cited by 29 publications (20 citation statements)
References: 9 publications
“…Also, there is not much work done for voice pathology using a convolutional neural network. Only Guedes et al [18] designed a system and reported an accuracy of 80%, and Zhang et al [19] also use the DNN model which was machine learning where outcomes were missing. So after a detailed literature review, it was concluded that a novel system can be proposed using pitch, 13 MFCC, rolloff, ZCR, energy entropy, spectral flux, spectral centroid, and energy as features and RNN as a classifier ).…”
Section: Related Work (mentioning)
confidence: 99%
“…Instead, this paper focuses on two publicly available databases: the Saarbruechen Voice Database (SVD) [12][13][14][15][16] and Voice ICar fEDerico II (VOICED) [16][17][18]. The following is a summary of existing approaches applied to the SVD.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The SMO based SVM yielded the best performance in accuracy (0.858), sensitivity (0.876), and specificity (0.839). Guedes et al [14] proposed two approaches, long short-term memory (LSTM) and convolutional neural network (CNN), for differentiation between healthy and dysphonic candidates, healthy and laryngitic candidates, and healthy and paralyzed candidates. The achieved precision values were 0.66, 0.67, and 0.78, respectively.…”
Section: Literature Review (mentioning)
confidence: 99%