Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018
DOI: 10.18653/v1/p18-1214

A Deep Relevance Model for Zero-Shot Document Filtering

Abstract: In the era of big data, focused analysis for diverse topics with a short response time becomes an urgent demand. As a fundamental task, information filtering therefore becomes a critical necessity. In this paper, we propose a novel deep relevance model for zero-shot document filtering, named DAZER. DAZER estimates the relevance between a document and a category by taking a small set of seed words relevant to the category. With pre-trained word embeddings from a large external corpus, DAZER is devised to extrac…
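The abstract describes scoring a document against a category that is represented only by a few seed words and pre-trained word embeddings. As a rough, hypothetical illustration of that idea (a centroid cosine-similarity baseline, not DAZER's actual deep relevance architecture), the scoring could be sketched as follows; `embeddings` is assumed to be a dict mapping tokens to numpy vectors:

```python
import numpy as np

def relevance_score(doc_tokens, seed_words, embeddings):
    """Cosine similarity between the document centroid and the seed-word
    centroid in a pre-trained embedding space. Illustrative baseline only;
    DAZER itself learns a deep relevance model over richer features."""
    def centroid(words):
        vecs = [embeddings[w] for w in words if w in embeddings]
        return np.mean(vecs, axis=0) if vecs else None

    d = centroid(doc_tokens)
    c = centroid(seed_words)
    if d is None or c is None:
        return 0.0  # no known words: treat as non-relevant
    return float(np.dot(d, c) / (np.linalg.norm(d) * np.linalg.norm(c)))
```

A zero-shot filter would then rank or threshold documents by such a score for a category never seen during training.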

Cited by 15 publications (9 citation statements)
References 31 publications
“…As to future work, we plan to transfer our model to other tasks such as reading comprehension, filtering, and summarization [22,33,35]. Also, we would like to apply reinforcement learning to improve the performance of dialogue generation.…”
Section: Discussion (mentioning)
confidence: 99%
“…To reduce this gap, Sachan et al. (2018) proposed two methods, namely keyword anonymisation and adaptive word dropout, to regularise the model and make it rely less on the keywords. Similarly, Li et al. (2018b) performed adversarial training with a Gradient Reversal Layer (Ganin et al. 2016) to remove category-specific information and to make the model generalise to unseen categories. Mudinas et al. (2018) proposed a novel unsupervised method to bootstrap domain-specific sentiment classifiers.…”
Section: Domain Adaptation (mentioning)
confidence: 99%
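The Gradient Reversal Layer mentioned in the excerpt above (Ganin et al. 2016) acts as the identity in the forward pass and multiplies gradients by a negative factor in the backward pass, so the feature extractor is trained adversarially against the category discriminator. A minimal PyTorch sketch of the generic construction (not the cited authors' code) follows:

```python
import torch
from torch import nn

class _GradReverse(torch.autograd.Function):
    """Identity forward; gradient scaled by -lambda on the way back."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse and scale the gradient flowing into the feature extractor.
        return grad_output.neg() * ctx.lambd, None


class GradientReversalLayer(nn.Module):
    def __init__(self, lambd=1.0):
        super().__init__()
        self.lambd = lambd

    def forward(self, x):
        return _GradReverse.apply(x, self.lambd)
```

Placed between a shared feature extractor and an auxiliary category classifier, this layer pushes the extracted features to carry as little category-specific information as possible.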
“…While keyword (or keyphrase) extraction from text has been extensively studied, how the selection of keywords impacts dataless classification was rarely, if ever, discussed. Previous work used either hand-picked keywords (Druck et al. 2008; Settles 2011; Meng et al. 2018) or relied on only the category name or category description (Chang et al. 2008; Li et al. 2016; Li et al. 2018b). The problem of extracting keywords from a (noisily) labelled corpus is defined formally as follows.…”
Section: Mining Abstract From (Noisy) Labelled Corpus (mentioning)
confidence: 99%
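The excerpt above contrasts hand-picked seed words with keywords mined from a (noisily) labelled corpus; the cited formal definition is not reproduced on this page. As a purely hypothetical illustration of the latter, candidate seed words could be ranked per category with a tf-idf-style score:

```python
import math
from collections import Counter

def mine_seed_words(docs_by_category, top_k=10):
    """Rank candidate seed words per category from (noisily) labelled,
    tokenised documents. Hypothetical baseline, not the cited method."""
    category_counts = {cat: Counter(tok for doc in docs for tok in doc)
                       for cat, docs in docs_by_category.items()}
    n_categories = len(category_counts)

    # Category frequency: in how many categories each word occurs.
    cf = Counter()
    for counts in category_counts.values():
        cf.update(counts.keys())

    seeds = {}
    for cat, counts in category_counts.items():
        # Frequent in this category, rare across the others.
        scored = {w: tf * math.log(n_categories / cf[w])
                  for w, tf in counts.items()}
        seeds[cat] = [w for w, _ in
                      sorted(scored.items(), key=lambda kv: -kv[1])[:top_k]]
    return seeds
```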
“…To the best of our knowledge, only one deep learning-based approach was proposed to address a problem similar to dataless text classification [18]. In this recent work, Li et al. devised a deep relevance model for zero-shot document filtering, which consists, at test time, of predicting the relevance of documents to a category unseen in the training set, where each category is characterized by a set of seed words.…”
Section: Related Work (mentioning)
confidence: 99%