Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.215

MultiQT: Multimodal learning for real-time question tracking in speech

Abstract: We address a challenging and practical task of labeling questions in speech in real time during telephone calls to emergency medical services in English, which embeds within a broader decision support system for emergency call-takers. We propose a novel multimodal approach to real-time sequence labeling in speech. Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed audio and its noisy transcription into text via automatic speech rec…

Cited by 3 publications (5 citation statements)
References 32 publications
“…The model is currently developed to English speaking EMS. 27 With more languages added the machine-learning models sensitivity may improved. Another trait of the false negative calls was calls, where the caller did not have access to the patient.…”
Section: False Negative Predictions (mentioning)
confidence: 99%
“…Though the model was partially open-sourced (Havtorn et al 2020; Maaløe et al 2019), our group did not pursue a technical understanding of explaining how the system worked because we were convinced that winning human trust for the AI decisions would originate and remain within the parameters of performance defined as speed and accuracy. Ultimately, the objectivity - the mathematical certainties - were not just descriptions of functionality but conceptual rigidities that commanded respect by humiliating human subjectivity and uncertainty.…”
Section: Explainability or Performance? (mentioning)
confidence: 99%
“…The text output of the language model was then fed to a classifier that predicted whether a cardiac arrest was happening or not (Figure 3). The AI system was applied directly on the audio stream where the only processing made was a short-term Fourier transformation (Havtorn et al, 2020), hence no explicit feature selection was made.…”
Section: The Technology Used (mentioning)
confidence: 99%
“…There is no explanation of how the ML makes its predictions. The company that developed the AI system has some of their work in the open domain (Maaløe et al, 2019; Havtorn et al, 2020). However, the exact details on the ML system used for this use case are not publicly available.…”
Section: The Technology Used (mentioning)
confidence: 99%