2015 54th IEEE Conference on Decision and Control (CDC), 2015
DOI: 10.1109/cdc.2015.7403283

Whittle index policy for crawling ephemeral content

Abstract: We consider the task of scheduling a crawler to retrieve from several sites their ephemeral content. This is content, such as news or posts at social network groups, for which a user typically loses interest after some days or hours. Thus development of a timely crawling policy for ephemeral information sources is very important. We first formulate this problem as an optimal control problem with average reward. The reward can be measured in terms of the number of clicks or relevant search requests. The problem…

Citations: cited by 15 publications (27 citation statements)
References: 28 publications
“…More recent work has addressed indexability in POMDP models with real-state projects, including, e.g., multisite tracking (Le Ny et al (2008)), dynamic multichannel access (Liu and Zhao (2010) and Ouyang et al (2015)), continuous-time (Le Ny et al (2011)) and discrete-time (Dance and Silander (2015)) multitarget tracking, demand response (Taylor and Mathieu (2014)), web crawling (Avrachenkov and Borkar (2018)), and hidden Markov bandits (Meshram et al (2018)).…”
Section: Real-state Projects: Prevailing Approaches To Indexability B (mentioning)
confidence: 99%
“…where $\nu_i(n) = 1$ if location $i$ is crawled at time $n$ and 0 otherwise, subject to the constraint that only $N_0 < N$ crawlers can be active at any time. This can be cast as a restless bandit problem that is Whittle indexable [2]. The Whittle index is given by [2]…”
Section: Problem Formulation (mentioning)
confidence: 99%
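
The activation constraint in the statement above can be illustrated with a minimal sketch of an index policy: at each step, crawl the N_0 sites whose current index is largest. The function name `select_sites_to_crawl` and the sample index values are hypothetical. The Whittle index formula itself is elided in the quoted text and is not reproduced here, so the indices are treated as given inputs.

```python
from typing import List, Sequence


def select_sites_to_crawl(indices: Sequence[float], n_active: int) -> List[int]:
    """Return a 0/1 activation vector nu, with nu[i] = 1 for the n_active
    sites that have the largest index values.

    `indices` stands in for per-site Whittle indices, assumed to be
    computed elsewhere; this sketch only enforces the activation budget.
    """
    if not 0 <= n_active <= len(indices):
        raise ValueError("n_active must be between 0 and the number of sites")
    # Rank site positions by index value, largest first. sorted() is stable,
    # so ties are broken in favour of the lower site number.
    ranked = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    chosen = set(ranked[:n_active])
    return [1 if i in chosen else 0 for i in range(len(indices))]


# Hypothetical index values for four sites, with a budget of two crawlers.
nu = select_sites_to_crawl([0.2, 1.5, 0.7, 0.1], n_active=2)
print(nu)
```

In this sketch the budget is met with equality at every step, which is the usual convention for index policies; whether the cited work allows idle crawlers is not stated in the excerpt.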
“…This work addresses a situation when the latter is known in a parametric form, but the parameters are unknown, and the algorithm is required to operate online with streaming real time data. As a test case, we consider the specific problem of scheduling web crawlers for ephemeral content analyzed in [2] (see also [3], [14] for related work). We describe this problem in the next section and summarize the main results of [2].…”
Section: Introduction (mentioning)
confidence: 99%
“…It is solved by the vanishing discount approach [3,26], by first considering a discounted reward system and then taking limits as the discount approaches to 1. Define…”
Section: Average Reward Problem (mentioning)
confidence: 99%
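
For orientation, the vanishing discount argument named in the statement above is commonly written in the following textbook form. The notation is assumed for illustration and is not taken from the citing paper: $V_\alpha$ is the optimal value function for discount factor $\alpha$, $\rho$ is the optimal average reward, $h$ is the relative value function, and $x_0$ is a fixed reference state. The relations hold when the limits exist, which requires conditions that the excerpt does not state.

```latex
\[
  \rho = \lim_{\alpha \uparrow 1} (1 - \alpha)\, V_\alpha(x),
  \qquad
  h(x) = \lim_{\alpha \uparrow 1} \bigl[ V_\alpha(x) - V_\alpha(x_0) \bigr].
\]
```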