Proceedings of the Fourth ACM International Conference on Web Search and Data Mining 2011
DOI: 10.1145/1935826.1935918

Scalable clustering of news search results

Abstract: In this paper, we present a system for clustering the search results of a news search engine. The news search interface includes the relevant news articles to a given query organized in terms of related news stories. Here each cluster corresponds to a news story and the news articles are clustered into stories. We present a system that clusters the search results of a news search system in a fast and scalable manner. The clustering system is organized into three components including offline clustering, increme…

Citations: cited by 24 publications (16 citation statements)
References: 17 publications
“…A broad array of methods rely on clustering algorithms to detect topics in documents. Vadrevu et al [2] propose an incremental version of k-means clustering algorithm, suited for dynamic sets of documents by incrementally reviewing already detected topics after a given number of new documents is introduced into the documents set. Forsati et al [3] propose a clustering based approach to content analysis utilized in a recommender system.…”
Section: Topic Detection
Citation type: mentioning
confidence: 99%

“…While there are many news clustering techniques, all were inappropriate for our analysis: kmeans clustering on a word vector space and Latent Dirichlet Allocation, used both in automated news clustering approaches [9,6,30] and studies of blogs [18], require a specific number of topics to be selected a priori, but we had no basis for choosing such a number. These techniques also ignore time: stories in the 1980s should not be clustered with problems from the 2000s just because they share a topic.…”
Section: Clustering Articles Into Stories
Citation type: mentioning
confidence: 99%

“…A considerable research in topic detection and tracking has been focusing on news stories and there are multiple methods of detecting a news story, that yield different representations of topics. Vadrevu et al [1] propose an incremental version of k-means clustering algorithm, suited for dynamic sets of documents by incrementally reviewing already detected topics after a given number of new documents is introduced into the documents set.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
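
The citing statements above describe an incremental clustering scheme in which already detected topics are reviewed after a given number of new documents arrive. The toy sketch below illustrates that general idea only; it is not the system from the paper. The bag-of-words vectors, the cosine similarity threshold used to open a new cluster (instead of a fixed k), and the names `IncrementalClusterer`, `threshold`, and `review_every` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class IncrementalClusterer:
    """Hypothetical sketch: assign on arrival, review every N documents."""

    def __init__(self, threshold=0.3, review_every=3):
        self.threshold = threshold        # assumed similarity cut-off
        self.review_every = review_every  # assumed review interval
        self.docs = []       # term-count vector per document
        self.labels = []     # cluster index per document
        self.centroids = []  # summed term counts per cluster
        self._since_review = 0

    def _nearest(self, vec):
        """Return (index, similarity) of the most similar centroid."""
        best, best_sim = None, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        return best, best_sim

    def add(self, text):
        """Assign a new document to a cluster, opening one if needed."""
        vec = Counter(text.lower().split())
        best, sim = self._nearest(vec)
        if best is None or sim < self.threshold:
            self.centroids.append(Counter())
            best = len(self.centroids) - 1
        self.centroids[best].update(vec)
        self.docs.append(vec)
        self.labels.append(best)
        self._since_review += 1
        if self._since_review >= self.review_every:
            self.review()
        return self.labels[-1]

    def review(self):
        """One k-means-style pass: reassign all documents, rebuild centroids."""
        new_labels = [self._nearest(vec)[0] for vec in self.docs]
        new_centroids = [Counter() for _ in self.centroids]
        for vec, label in zip(self.docs, new_labels):
            new_centroids[label].update(vec)
        self.labels = new_labels
        self.centroids = new_centroids
        self._since_review = 0


# Usage: four short made-up headlines, reviewed after every third document.
clusterer = IncrementalClusterer(threshold=0.3, review_every=3)
for headline in [
    "election vote results",
    "election vote count",
    "football match score",
    "football match final",
]:
    clusterer.add(headline)
print(clusterer.labels)
```

Because cosine similarity ignores vector length, summed term counts stand in for mean centroids here. Opening clusters by threshold sidesteps the need to choose the number of topics a priori, a limitation of plain k-means noted in one of the statements above, at the cost of introducing a threshold to tune.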