Interspeech 2017
DOI: 10.21437/interspeech.2017-1480

Label-Dependency Coding in Simple Recurrent Networks for Spoken Language Understanding

Abstract: Modeling target label dependencies is important for sequence labeling tasks. This may become crucial in Spoken Language Understanding (SLU) applications, especially for the slot-filling task, where models often have to deal with a large number of target labels. Conditional Random Fields (CRF) were previously considered the most effective algorithm under these conditions. More recently, different architectures of Recurrent Neural Networks (RNNs) have been proposed for the SLU slot-filling task. Most o…
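As a rough illustration of the general idea behind label-dependency coding (a minimal sketch, not the architecture evaluated in the paper), the snippet below shows a Jordan-style recurrent tagger in PyTorch that feeds the embedding of the previously predicted label back into the recurrent step, so each slot decision is conditioned on the label history as well as the word history. All names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class LabelFeedbackTagger(nn.Module):
    """Sketch of a recurrent tagger with previous-label feedback (Jordan-style)."""

    def __init__(self, vocab_size, num_labels, word_dim=200, label_dim=200, hidden_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.label_emb = nn.Embedding(num_labels, label_dim)
        # one recurrent step takes [word embedding ; previous-label embedding]
        self.cell = nn.GRUCell(word_dim + label_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_labels)

    def forward(self, word_ids):
        # word_ids: LongTensor of shape (batch, seq_len)
        batch, seq_len = word_ids.shape
        h = word_ids.new_zeros(batch, self.cell.hidden_size, dtype=torch.float)
        prev_label = word_ids.new_zeros(batch)          # assume index 0 = "O"/start label
        logits = []
        for t in range(seq_len):
            x = torch.cat([self.word_emb(word_ids[:, t]),
                           self.label_emb(prev_label)], dim=-1)
            h = self.cell(x, h)
            step_logits = self.out(h)
            logits.append(step_logits)
            prev_label = step_logits.argmax(dim=-1)     # greedy label feedback
        return torch.stack(logits, dim=1)               # (batch, seq_len, num_labels)
```

In training such a model one would typically feed the gold previous label (teacher forcing) and switch to greedy or beam-search decoding at test time; the paper itself compares several ways of encoding this label history against CRF-style dependency modeling.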

Cited by 11 publications (50 citation statements)
References 22 publications
“…For example, the missing words in the pre-trained French word embedding adversely affected the F1 scores for MEDIA. The approach can be easily adapted to a variety of different network architectures (e.g., (Dinarelli et al, 2017)) and word embeddings (e.g., (Reimers and Gurevych, 2017a)). Future studies will focus on how to choose a good set of concepts for the PC priming strategy.…”
Section: Discussion
confidence: 99%
“…For both ATIS and MEDIA, entities are used as the utterance input. In contrast to [3], no context windows were used as part of the inputs in our models. Instead, contextual information has been exploited at different stages by our models, as described in Section 2.…”
Section: Datasets
confidence: 99%
“…Note again that our word and label embeddings have 200 dimensions in both ATIS and MEDIA, while [3] used 100 and 200 dimensions for ATIS and MEDIA, respectively. Even with much fewer dimensions, the Jordan network based model in [3] still requires more than 1.7 million parameters, while, in comparison, our model needs only 682,000 parameters and can achieve comparable performance. There are at least two reasons why our approach requires far fewer parameters.…”
Section: Bl
confidence: 99%
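Parameter figures like the 1.7 million versus 682,000 quoted above can be sanity-checked with back-of-the-envelope counting. The sketch below is purely illustrative (the exact vocabularies and layer sizes behind those figures are not given here); it only shows how embedding tables, recurrent weights, and the output projection add up, and why the embedding dimension dominates when the vocabulary is large.

```python
# Rough parameter counting for an embedding + recurrent tagger (illustrative only).

def embedding_params(vocab_size, dim):
    return vocab_size * dim

def gru_params(input_dim, hidden_dim):
    # 3 gates, each with input-to-hidden and hidden-to-hidden weights plus a bias
    return 3 * (input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim)

# hypothetical sizes, not taken from either paper
vocab, labels, dim, hidden = 2000, 100, 200, 256
total = (embedding_params(vocab, dim)        # word embedding table
         + embedding_params(labels, dim)     # label embedding table
         + gru_params(2 * dim, hidden)       # recurrent layer over [word ; label]
         + hidden * labels + labels)         # output projection
print(f"approximate parameter count: {total:,}")
```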