2018 IEEE Spoken Language Technology Workshop (SLT)
DOI: 10.1109/slt.2018.8639513
End-To-End Named Entity And Semantic Concept Extraction From Speech

Abstract: Named entity recognition (NER) is among the SLU tasks that extract semantic information from textual documents. Until now, NER from speech has been performed through a pipeline process: an automatic speech recognition (ASR) system is first applied to the audio, and NER is then applied to the ASR outputs. This approach has several disadvantages (error propagation, a metric for tuning ASR systems that is sub-optimal with respect to the final task, a reduced search space at the ASR output level, ...), and it is known that more inte…

Cited by 84 publications (98 citation statements). References 18 publications.
“…We evaluate both ASR (Word Error Rate) and SLU (Concept Error Rate) results on MEDIA corpus (Dev and Test).

    State-of-the-art Models   Training data   Dev    Test
    E2E SLU [4]               300h            30.1   27.0
    E2E Baseline [3]          41.5h           -      39.8
    E2E SLU [3]               500h            -      23.7
    E2E SLU + curr. [3]       500h            -      16.4

ASR results are presented in Table 2.…”
Section: Results (mentioning)
confidence: 99%
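Both metrics quoted above are edit-distance based: WER is computed over word sequences and CER (Concept Error Rate) over concept-tag sequences. A minimal illustrative sketch of that computation in pure Python (not the scoring tool used in the cited work):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (single-row DP)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution
    return d[-1]

def error_rate(ref_tokens, hyp_tokens):
    """WER if tokens are words, CER if tokens are concept tags (in %)."""
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)
```

For example, one substitution among four reference tokens yields an error rate of 25.0.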
“…4 SLU performances are given in Table 3. Our results can be compared with some previous works [4,3]. We note however that results reported in [4,3] are obtained with models trained with much more data exploiting NER tasks with transfer learning.…”
Section: Results (mentioning)
confidence: 99%
“…Nowadays there is a growing research interest in end-to-end systems for various SLU tasks [23][24][25][26][27][28][29][30][31]. In this work, similarly to [26,29], end-to-end training of signal-to-concept models is performed through the recurrent neural network (RNN) architecture and the connectionist temporal classification (CTC) loss function [32] as shown in Figure 1. A spectrogram of power normalized audio clips calculated on 20ms windows is used as the input features for the system.…”
Section: End-to-end Signal-to-concept Neural Architecture (mentioning)
confidence: 99%
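The CTC loss mentioned above trains the network to emit a per-frame label distribution that includes a blank symbol; at inference, the standard greedy decode collapses consecutive repeats and then drops blanks. A minimal sketch of that collapse step (pure Python, illustrative only; the blank symbol here is an assumed notation):

```python
BLANK = "_"  # assumed CTC blank symbol

def ctc_greedy_collapse(frame_labels):
    """Collapse a per-frame best-path label sequence into the output string:
    merge consecutive repeated labels, then remove blank symbols."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != BLANK:
            out.append(lab)
        prev = lab
    return "".join(out)

# e.g. the frame sequence "hh_e_ll_ll_oo" collapses to "hello"
```

The blank symbol is what lets CTC distinguish a genuine double letter ("ll") from one letter spread across several frames.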
“…First, we integrated dialog history into this system based on dialog history embedding vectors (h-vectors) as shown in Figure 1 and proposed in Section 3. Second, in this paper, the task is SF, therefore the output sequence besides the alphabetic characters also contains special characters corresponding to the semantic tags [26,29].…”
Section: Signal-to-concept Models (mentioning)
confidence: 99%
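As the quote describes, slot filling (SF) is cast as character-level recognition whose output alphabet is extended with special characters marking semantic tags. A hypothetical illustration of how such character targets could be built (the tag inventory and bracket symbols are assumptions, not the papers' exact notation):

```python
# Map each semantic concept to an (open, close) pair of special characters.
TAG_CHARS = {"city": ("<", ">"), "date": ("[", "]")}  # assumed inventory

def to_char_targets(segments):
    """segments: list of (text, concept-or-None) pairs.
    Returns the character target sequence with tag symbols inlined,
    so the model can emit tags as ordinary output characters."""
    chars = []
    for text, concept in segments:
        if concept is None:
            chars.extend(text)
        else:
            open_c, close_c = TAG_CHARS[concept]
            chars.extend(open_c + text + close_c)
    return chars

# e.g. to_char_targets([("in ", None), ("paris", "city")])
# -> list("in <paris>")
```

With this encoding, a single character-level decoder produces both the transcription and the concept boundaries in one pass.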