Proceedings of the 2017 ACM on Conference on Information and Knowledge Management 2017
DOI: 10.1145/3132847.3133016

Efficient Document Filtering Using Vector Space Topic Expansion and Pattern-Mining

Abstract: Automatically extracting information from social media is challenging given that social content is often noisy, ambiguous, and inconsistent. However, as many stories break on social channels first before being picked up by mainstream media, developing methods to better handle social content is of utmost importance. In this paper, we propose a robust and effective approach to automatically identify microposts related to a specific topic defined by a small sample of reference documents. Our framework extracts clusters …

Citations: cited by 8 publications (4 citation statements)
References: 30 publications (51 reference statements)
“…Our method of integrating logic rules into machine learning is related to the current trend of AI research moving from machine learning to neuro-symbolic methods, and from data- to hybrid data- and knowledge-driven approaches. Those methods have shown to be more robust due to their capability in representing concepts and the causal relations among them, and have demonstrated their effectiveness for several tasks including health monitoring [5], document filtering [48], stock pricing [49]. Methodologically, there are mainly two approaches for integrating symbolic knowledge into neural networks.…”
Section: Neuro-symbolic Methods (mentioning)
confidence: 99%
“…Instead of relying on keyword matching like cosine similarity, WMD attempts to find an optimal traveling cost between two documents in the word embedding space. WMD has been widely adopted in the literature to measure short text semantic similarity [24,34]. Here, we adopt WMD as the relevance measure for a pair of two documents.…”
Section: Data Preparation (mentioning)
confidence: 99%
“…Document filtering is the task to separate relevant documents from the irrelevant ones for a specific topic (Robertson and Soboroff, 2002; Nanas et al., 2010; Gao et al., 2015; Proskurnia et al., 2017). Both ranking and classification based solutions have been developed (Harman, 1994; Robertson and Soboroff, 2002; Soboroff and Robertson, 2003).…”
Section: Related Work (mentioning)
confidence: 99%
“…Frequent term patterns in terms of fine-grained hidden topics are proposed in (Gao et al., 2015) for document filtering. Very recently, frequent term patterns are also utilized to perform event-based microblog filtering (Proskurnia et al., 2017). However, these approaches are all based on supervised-learning, which requires a significant amount of positive documents for each topic.…”
Section: Related Work (mentioning)
confidence: 99%