Proceedings of the 2007 SIAM International Conference on Data Mining 2007
DOI: 10.1137/1.9781611972771.52

A System for Keyword Search on Textual Streams

Abstract: An increasing amount of data is produced in the form of text streams, such as RSS news feeds, TV closed captions, and emails. We study the problem of answering keyword queries on multiple textual streams. Our definition of a keyword query result is inspired by previous work on keyword search on static databases. A result to a query is a combination of streams "sufficiently correlated" to each other that collectively contain all query keywords within a specified time span. On the algorithmic side, in this pa…
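The result definition in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, which the truncated abstract does not show. It is a brute-force illustration under stated assumptions: each stream is a list of (timestamp, text) items, keywords match whole lowercase whitespace-separated tokens, only minimal stream combinations are reported, and the "sufficiently correlated" condition is omitted because the abstract does not define it. All names (`keyword_results`, `hits`, `covers`, the sample `streams`) are hypothetical.

```python
from itertools import combinations

def hits(items, keywords):
    """Return sorted (timestamp, keyword) pairs for keywords found in one stream."""
    out = []
    for ts, text in items:
        words = set(text.lower().split())  # assumption: naive whitespace tokenization
        out.extend((ts, k) for k in keywords if k in words)
    return sorted(out)

def covers(events, keywords, span):
    """True if some time window of length <= span contains every keyword."""
    events = sorted(events)
    for i, (start, _) in enumerate(events):
        seen = set()
        for ts, k in events[i:]:
            if ts - start > span:
                break
            seen.add(k)
        if seen == keywords:
            return True
    return False

def keyword_results(streams, keywords, span):
    """List minimal combinations of streams that jointly contain all keywords
    within `span` time units. The correlation test from the abstract is NOT
    implemented here (assumption: it would be an extra filter on each combo)."""
    keywords = {k.lower() for k in keywords}
    per_stream = {name: hits(items, keywords) for name, items in streams.items()}
    results = []
    for size in range(1, len(streams) + 1):
        for combo in combinations(sorted(streams), size):
            # Skip supersets of an already-reported combination.
            if any(set(r) <= set(combo) for r in results):
                continue
            events = [e for name in combo for e in per_stream[name]]
            if covers(events, keywords, span):
                results.append(combo)
    return results

# Hypothetical sample data: stream name -> list of (timestamp, text).
streams = {
    "news": [(0, "earthquake hits city"), (50, "sports update")],
    "tv": [(5, "rescue teams arrive")],
    "mail": [(100, "rescue earthquake drill")],
}
print(keyword_results(streams, ["earthquake", "rescue"], span=10))
```

The enumeration is exponential in the number of streams, so it only serves to make the definition concrete; a practical system would need an incremental approach.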

Cited by 5 publications
(1 citation statement)
References 15 publications
“…Unfortunately, previous works on RSS/Atom statistical characteristics [15,27,13] do not provide a precise and updated characterization of feeds' behavior and content which could be effectively used for tuning refreshing policies of RSS aggregators [24,22], benchmarking scalability and performance of RSS continuous monitoring mechanisms [19,9,7,8,25,5] or comparing various techniques for RSS items mining, recommendation, enrichment and archiving [3,26]. In this paper, we present the first thorough analysis of three complementary features of real-scale RSS/Atom feeds, namely, publication activity, items structure and length, as well as, vocabulary of the textual content.…”
Section: Introduction (mentioning)
confidence: 99%