2022
DOI: 10.48550/arxiv.2205.12228
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems

Abstract: In natural language understanding (NLU) production systems, users' evolving needs necessitate the addition of new features over time, indexed by new symbols added to the meaning representation space. This requires additional training data and results in ever-growing datasets. We present the first systematic investigation into this incremental symbol learning scenario. Our analyses reveal a troubling quirk in building (broad-coverage) NLU systems: as the training dataset grows, more data is needed to learn new … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 24 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?