Proceedings of the Third Workshop on Representation Learning for NLP 2018
DOI: 10.18653/v1/w18-3008
Comparison of Representations of Named Entities for Document Classification

Abstract: We explore representations for multi-token names in the context of the Reuters topic and sector classification tasks (RCV1). We find that: the best way to treat names is to split them into tokens and use each token as a separate feature; NEs have more impact on sector classification than on topic classification; replacing all NEs with special entity-type tokens is not an effective strategy; representing tokens by different embeddings for proper names vs. common nouns does not improve results. We highlight the i…

Cited by 3 publications (3 citation statements)
References 12 publications
“…The scalar δ_i is equal to the average of D(x_i^s, x_i^t)^2 over all data points. For the rcv1 (subset1) text classification dataset, we obtain the vectorial representation based on the word embedding used by [30]. For the MTS datasets, we compute each K_i by employing the global alignment kernel (GAK) [31] for the i-th dimension of the time series.…”
Section: Methods
confidence: 99%
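The δ_i computation described in the quoted excerpt (an average of squared distances D(x_i^s, x_i^t)^2 over all data points) can be sketched as follows. This is a minimal illustration, not code from the cited work: the function and variable names are hypothetical, and Euclidean distance is assumed for D.

```python
import numpy as np

def delta(xs: np.ndarray, xt: np.ndarray) -> float:
    """Average of D(x^s, x^t)^2 over all data points.

    xs, xt: (n, d) arrays of source/target representations for one
    dimension i; D is assumed here to be the Euclidean distance.
    """
    d = np.linalg.norm(xs - xt, axis=1)  # D(x^s, x^t) per data point
    return float(np.mean(d ** 2))        # average squared distance

# Toy usage with illustrative arrays: each pairwise distance is sqrt(3),
# so the average squared distance is 3.0.
xs = np.zeros((4, 3))
xt = np.ones((4, 3))
print(delta(xs, xt))  # → 3.0
```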
“…Various attempts have been made to learn semantic representations from different aspects, such as entities (Pivovarova and Yangarber, 2018) and abstracts (Kim and Gil, 2019). Our hypothesis is that entities can provide a knowledge backbone and abstracts can provide an overall summary, and that the two should be modeled and learned together interactively.…”
Section: Semantic Information Learning
confidence: 99%
“…While they reported that the use of NEs improves the accuracy of document classification, the contribution to subword-based neural network models was not investigated. Pivovarova and Yangarber (2018) compared representations of NEs for neural network-based models in the document classification task. They reported that replacing named-entity tokens with special tokens representing NE categories does not improve the accuracy of document classification.…”
Section: Related Work
confidence: 99%