2021
DOI: 10.3390/s21093279

Toward an Automatic Quality Assessment of Voice-Based Telemedicine Consultations: A Deep Learning Approach

Abstract: Maintaining a high quality of conversation between doctors and patients is essential in telehealth services, where efficient and competent communication is important to promote patient health. Assessing the quality of medical conversations is often handled based on a human auditory-perceptual evaluation. Typically, trained experts are needed for such tasks, as they follow systematic evaluation criteria. However, the daily rapid increase of consultations makes the evaluation process inefficient and impractical.…

Citations: cited by 13 publications (7 citation statements)
References: 35 publications
“…The process to obtain a Mel-spectogram is that the spectrogram is derived by short-term Fourier transform and then the resulting spectrogram is transformed into the human perception scale [16]. In order to extract spectrogram characteristics from an audio signal, it must first be divided into brief overlapping windowing, transformed into the frequency domain using the Fourier transform, and then generated into an envelope spectrogram using a Mel filter bank (see Figure 2) [17]. MFCC, which stands for Mel frequency spectrum per frame, is a type of signal spectrum that can be obtained collectively [13].…”
Section: Methods (mentioning)
confidence: 99%
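The steps named in this statement (overlapping windows, Fourier transform, Mel filter bank) can be sketched in a few lines of NumPy. This is a minimal illustration, not the citing paper's implementation: the function names, the 512-sample window, the 256-sample hop, the 40 Mel bands, the Hann window, and the common 2595·log10(1 + f/700) Mel formula are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(hz):
    # One common Hz-to-mel formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filter_bank(n_mels, n_fft, sample_rate):
    # n_mels + 2 edge points, equally spaced on the mel scale up to Nyquist.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        # Triangular filter: the lower of the two slopes, clipped at zero.
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_spectrogram(signal, sample_rate, n_fft=512, hop=256, n_mels=40):
    # Assumes len(signal) >= n_fft; trailing samples that do not fill a frame are dropped.
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # 1) Split the signal into short overlapping frames and apply a Hann window.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # 2) Short-time Fourier transform: power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 3) Project onto the mel filter bank and convert to decibels.
    mel_power = power @ mel_filter_bank(n_mels, n_fft, sample_rate).T
    return 10.0 * np.log10(mel_power + 1e-10)

# Usage: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
spec = mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr)
```

The result has one row per frame and one column per Mel band. Applying a discrete cosine transform to each row would be the usual next step toward MFCCs, which the statement mentions but this sketch does not cover.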
“…Data sets comprising recordings of clinician–patient dialogues have been used for assessing communication performance and the factors that affect it in medical consultations. Audio recordings have been used in the healthcare domain for content analysis, for assessing the effectiveness of face-to-face to telephone and video consultations [18, 19], and for detecting Alzheimer’s dementia [20] through analysis of prosodic and paralinguistic features of spontaneous speech, among other applications. The combination of audio and visual data is often used in academic assessments of the performance of medical students and trainee general practitioners (GPs) in medical consultations, and in research on communication in healthcare, such as research on the impact of the introduction of electronic health records in GPs’ offices by monitoring the GP’s focus of attention [21].…”
Section: Related Work (mentioning)
confidence: 99%
“…On the other hand, the F-measure is the harmonic mean of precision and recall [61], which is calculated by the following equation:…”
Section: Evaluation Measures (mentioning)
confidence: 99%
“…where the Precision is the percentage of correct predictions for a class relative to all the predictions of the same class [61], and the Recall is the percentage of correct predictions for a class relative to all instances that actually belong to the class [62].…”
Section: Evaluation Measures (mentioning)
confidence: 99%
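The precision, recall, and F-measure definitions quoted above can be checked with a small per-class calculation. The citing paper's own equation is elided in the excerpt, so this sketch assumes the standard balanced form, F = 2 · Precision · Recall / (Precision + Recall). The function name and the "good"/"bad" labels are hypothetical and serve only as an illustration.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall, and balanced F-measure for the class `positive`."""
    # Correct predictions for the class (true positives).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    # All predictions of the class (denominator of precision).
    predicted = sum(1 for p in y_pred if p == positive)
    # All instances that actually belong to the class (denominator of recall).
    actual = sum(1 for t in y_true if t == positive)

    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Usage with hypothetical consultation-quality labels.
y_true = ["good", "good", "bad", "good", "bad"]
y_pred = ["good", "bad", "bad", "good", "good"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="good")
```

In this example, two of the three "good" predictions are correct and two of the three actual "good" instances are found, so precision, recall, and the F-measure are all 2/3.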