Many real‐world data mining applications have to deal with unlabeled streaming data. They are unlabeled because the sheer volume of the stream makes it impractical to label a significant portion of the data. The data streams can evolve over time and these changes are called concept drifts. Concept drifts have different characteristics, which can be used to categorize them into different types. A trade‐off between performance and cost exists among many concept drift detection approaches. On the one hand, high accuracy detection approach usually requires labeled data, possibly involving high cost for labeling. On the other hand, a variety of methods have been devoted to the topic of concept drift detection with unlabeled data, but these approaches often are most suited for only a subset of the concept drift types. The objective of this survey is to present these methods, categorize them and give recommendations of usage based on their behaviors under different types of concept drift.
This article is categorized under:
Fundamental Concepts of Data and Knowledge > Data Concepts
Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining
Explainable AI > Classification