Sequential reservoir sampling with a nonuniform distribution

Kolonko, Michael; Wäsch, D.

doi:10.1145/1141885.1141891

Cited by 17 publications

(21 citation statements)

References 3 publications


“…Ideally, the sample buffers should keep a balance between sample diversity and adaptability. Motivated by this, reservoir sampling [18][19][20][21] is proposed for sequential random sampling. In principle, it aims to randomly draw some samples from a large population of samples that come in a sequential manner.…”

Section: Time-weighted Reservoir Sampling (mentioning)

confidence: 99%
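The sequential draw described in the statement above can be illustrated with the classic uniform reservoir scheme (often called Algorithm R). This is a minimal sketch, not the nonuniform method of Kolonko and Wäsch and not the citing paper's algorithm; the function name `reservoir_sample` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1000), 5, random.Random(42)))
```

Each item is seen once and only k items are stored, which is why the scheme suits samples that arrive sequentially.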

“…Therefore, larger weights should be assigned to the recently added samples while smaller weights should be attached with the old samples. Inspired by [20,21], we design a time-weighted reservoir sampling (TWRS) method for randomly drawing the samples according to their time-varying properties, as listed in Algorithm 3. The designed TWRS method is capable of effectively maintaining the sample buffers for online metric learning in Sec.…”

Section: Time-weighted Reservoir Sampling (mentioning)

confidence: 99%
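One way to picture "larger weights for recently added samples" is a weighted reservoir in which the weight grows with arrival time. The sketch below is not Algorithm 3 of the citing paper and not the method of Kolonko and Wäsch. It swaps in the well-known Efraimidis-Spirakis key scheme (keep the k items with the largest u^(1/w)) and assumes an exponential weight w_t = exp(decay * t). The name `time_weighted_reservoir` and the `decay` parameter are hypothetical.

```python
import heapq
import math
import random


def time_weighted_reservoir(stream, k, decay=0.01, rng=None):
    """Weighted reservoir sample without replacement that favours recent items.

    Assumed weight of the item arriving at step t: w_t = exp(decay * t).
    Ranking by u**(1 / w_t) is equivalent to ranking by
    decay * t - log(E) with E ~ Exp(1), which avoids overflow for large t.
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (score, t, item); the root is the weakest kept item
    for t, item in enumerate(stream):
        e = rng.expovariate(1.0)
        while e == 0.0:  # guard against log(0); extremely unlikely
            e = rng.expovariate(1.0)
        score = decay * t - math.log(e)
        if len(heap) < k:
            heapq.heappush(heap, (score, t, item))
        elif score > heap[0][0]:
            # The new item outranks the weakest kept item, so it takes its place.
            heapq.heapreplace(heap, (score, t, item))
    return [item for _, _, item in heap]


if __name__ == "__main__":
    print(sorted(time_weighted_reservoir(range(1000), 5, decay=0.05,
                                         rng=random.Random(1))))
```

With `decay=0` every item has the same weight and the sample is uniform; larger values of `decay` concentrate the buffer on recent arrivals, trading diversity for adaptability.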

“…To allow for real-time applications, we design a time-weighted reservoir sampling method to maintain and update limited-sized sample buffers for balancing between sample diversity and adaptability in the metric learning procedure. With the theory of [20,21], larger weights are assigned to those recently received samples, which is particularly important for tracking. To our knowledge, it is the first time that reservoir sampling is used in an online metric learning setting that is tailored for robust visual tracking.…”

Section: Introduction (mentioning)

confidence: 99%


Non-sparse linear representations for visual tracking with online reservoir metric learning

Shen, Shi, et al. (2012)

2012 IEEE Conference on Computer Vision and Pattern Recognition


“…Although we can directly sampling the weighted random sample R from D, but because the needs of sample merge which will be discussed later, we exploit a weighted random sampling method on data stream (called WRS, meaning Weighted Reservoir Sampling, refer to [12]) to obtain R.…”

Section: Representation Of Data Nodes (mentioning)

confidence: 99%

Weighted Random sampling based hierarchical amnesic synopses for data streams

Chen, Kang-Li (2010)

2010 5th International Conference on Computer Science & Education


Maintaining a synopsis structure dynamically from a data stream is vital for a variety of streaming data applications, such as approximate query answering or data mining. In many cases, the significance of a data item in a stream decays with age: the item may convey critical information at first, but as time goes by it becomes less and less important until it is eventually useless. This characteristic is termed amnesic. Random sampling is often used to construct synopses for streaming data. This paper proposes a Weighted Random Sampling based Hierarchical Amnesic Synopsis, which incorporates the amnesic characteristic of data streams into synopsis generation. Construction methods for weighted random sampling with and without replacement are discussed. We experimentally evaluate the proposed synopsis structure.


“…Sampling is a very natural way to summarize data properties with sublinear space; indeed, it is a key component of many streaming algorithms and techniques. Just to mention a few, the relevant papers include Aggarwal [ [10]; Bar-Yossef [13]; Bar-Yossef, Kumar and Sivakumar [17]; Buriol, Frahling, Leonardi, Marchetti-Spaccamela and Sohler [20]; Chakrabarti, Cormode and McGregor [21]; Chaudhuri and Mishra [26]; Chaudhuri, Motwani and Narasayya [27]; Cohen [29]; Cohen and Kaplan [30]; Cormode, Muthukrishnan and Rozenbaum [32]; Dasgupta, Drineas, Harb, Kumar and Mahoney [35]; Datar and Muthukrishnan [37]; Duffield, Lund and Thorup [38]; Frahling, Indyk and Sohler [43]; Gandhi, Suri and Welzl [46]; Gemulla [47]; Gemulla and Lehner [48]; Gibbons and Matias [49]; Guha, Meyerson, Mishra, Motwani and O'Callaghan [54]; Haas [55]; Kolonko and Wäsch [58]; Li [62]; Palmer and Faloutsos [67]; Szegedy [70]; and Vitter [72]. These papers illustrate the vitality of effective sampling methods for data streams. Among other methods, uniform random sampling is the most general and well-understood.…”

Section: Introduction (mentioning)

confidence: 99%

Optimal sampling from sliding windows

Braverman, Ostrovsky, Zaniolo (2009)

Proceedings of the Twenty-Eighth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems


Appeared in ACM PODS 2009.

A sliding windows model is an important case of the streaming model, where only the most "recent" elements remain active and the rest are discarded in a stream. The sliding windows model is important for many applications (see, e.g., Babcock, Babu, Datar, Motwani and Widom (PODS 02); and Datar, Gionis, Indyk and Motwani (SODA 02)). There are two equally important types of the sliding windows model: windows with fixed size (e.g., where items arrive one at a time, and only the most recent n items remain active for some fixed parameter n), and bursty windows (e.g., where many items can arrive in "bursts" at a single step and where only items from the last t steps remain active, again for some fixed parameter t).

Random sampling is a fundamental tool for data streams, as numerous algorithms operate on the sampled data instead of on the entire stream. Effective sampling from sliding windows is a nontrivial problem, as elements eventually expire. In fact, the deletions are implicit; i.e., it is not possible to identify deleted elements without storing the entire window. The implicit nature of deletions on sliding windows does not allow the existing methods (even those that support explicit deletions, e.g., Cormode, Muthukrishnan and Rozenbaum (VLDB 05); Frahling, Indyk and Sohler (SOCG 05)) to be directly "translated" to the sliding windows model. One trivial approach to overcoming the problem of implicit deletions is that of over-sampling. When k samples are required, the over-sampling method maintains k' > k samples in the hope that at least k samples are not expired. The obvious disadvantages of this method are twofold: (a) it introduces additional costs and thus decreases the performance; and (b) the memory bounds are not deterministic, which is atypical for…

Supported in part by NSF grant 0830803.
