2022
DOI: 10.1145/3494832

Scarcity of Labels in Non-Stationary Data Streams: A Survey

Abstract: In a dynamic stream there is an assumption that the underlying process generating the stream is non-stationary and that concepts within the stream will drift and change as the stream progresses. Concepts learned by a classification model are prone to change and non-adaptive models are likely to deteriorate and become ineffective over time. The challenge of recognising and reacting to change in a stream is compounded by the scarcity of labels problem. This refers to the very realistic si…

Cited by 21 publications (8 citation statements)
References 191 publications
“…Yet the work makes unrealistic assumptions about the label delay, e.g., fixed delay which is also known a priori; the authors point out the necessity for more research on generalizations thereof [18,20]. Drift detection in presence of verification latency is getting some attention in recent years [6,16]. Semi-supervised drift detection methods are rather prominent, which monitor the performance of the queried labeled in a specific task [12,24].…”
Section: Related Work (mentioning)
confidence: 99%
“…In potentially critical settings, there is an interest to continuously monitor the data stream and analyze data samples at the time they are recorded e.g., in order to detect failing machines or determine the current operating state of a machine in a factory setting. Data-driven online machine learning techniques can address such tasks, but require labeled data samples for task-specific training [6]. Realistic data streams are often nonstationary and the underlying data statistics might change over time, so-called concept drifts [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…Furthermore, the ability to effectively adapt to change is required. According to C. Fahy et al. [10], one of the most difficult challenges in streaming data analysis is dealing with changes in a stream. Changes can be sudden or gradual, persistent, recurring, or transient.…”
Section: Introduction (mentioning)
confidence: 99%
“…When new features appear in the stream, the dimensionality of $X_i$ changes, resulting in feature evolution. If there is feature evolution, $D_t \neq D_{t+\delta}$ [10]. A majority of existing algorithms ([2], [7], [20], [21], [26], [37], [44], [47]) address two major issues related to streaming data: the "infinite length" of the data and the "concept drift" of the data. Unfortunately, feature drift and feature evolution have not received significant attention.…”
Section: Introduction (mentioning)
confidence: 99%
“…Many AL strategies discard unlabeled samples, albeit this might induce a loss of valuable information [1]. According to Fahy et al [4] so-called self-labeling (SL) is one possible future research direction for using the unlabeled samples. SL strategies based on majority clustering, label prediction, or more deep generative modeling have been proposed in the context of online learning [11,12].…”
Section: Introduction (mentioning)
confidence: 99%