Jicheng Shan scite author profile

In practical applications, data stream classification faces significant challenges, such as the high cost of labeling instances and potential concept drift. We present a new online active learning ensemble framework for drifting data streams based on a hybrid labeling strategy. The framework includes: 1) an ensemble classifier, which consists of a long-term stable classifier and multiple dynamic classifiers (a multilevel sliding window model is used to create and update the dynamic classifiers so that both gradual and sudden drift in the data stream are handled effectively) and 2) active learning, which takes a nonfixed labeling budget, supports on-demand label requests, and adopts an uncertainty strategy and a random strategy to label instances. The decision threshold of the uncertainty strategy is adjusted dynamically: when concept drift occurs, the threshold is gradually reduced so that the most uncertain instances are queried first, keeping the request expense as low as possible. Experiments on synthetic and real data sets show that the proposed method achieves high prediction accuracy without increasing the total cost of labeling, and that the labeling cost can be dynamically allocated according to the concept drift.
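The hybrid labeling strategy described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the margin-based uncertainty measure, the function names (`margin`, `should_query`, `shrink_threshold`), and the constants `decay`, `floor`, and `random_rate` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import random


def margin(probabilities):
    """Gap between the two highest class probabilities (small gap = uncertain)."""
    top, second = sorted(probabilities, reverse=True)[:2]
    return top - second


def should_query(probabilities, threshold, random_rate, rng):
    """Hybrid decision: uncertainty strategy first, random strategy as fallback."""
    if margin(probabilities) < threshold:
        return True  # uncertainty strategy: the prediction is too close to call
    return rng.random() < random_rate  # random strategy: occasional extra label


def shrink_threshold(threshold, decay=0.9, floor=0.05):
    """Gradually lower the threshold while drift is signalled (assumed schedule)."""
    return max(threshold * decay, floor)


rng = random.Random(0)
threshold = 0.4
stream = [([0.55, 0.45], False), ([0.7, 0.3], True), ([0.9, 0.1], True)]
for probs, drift_detected in stream:
    if drift_detected:
        threshold = shrink_threshold(threshold)
    print(probs, round(threshold, 3), should_query(probs, threshold, 0.1, rng))
```

Lowering the threshold narrows the set of instances that count as uncertain, so only the closest calls trigger a label request, which is one way to read the abstract's goal of limiting request expense during drift.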

Online Active Learning Paired Ensemble for Concept Drift and Class Imbalance

Zhang

Liu

Shan

et al. 2018

IEEE Access

Resample-Based Ensemble Framework for Drifting Imbalanced Data Streams

et al. 2019

Machine learning in real-world scenarios is often challenged by concept drift and class imbalance. This paper proposes a Resample-based Ensemble Framework for Drifting Imbalanced Stream (RE-DI). The ensemble framework consists of a long-term static classifier to handle gradual concept drift and multiple dynamic classifiers to handle sudden concept drift. The weights of the ensemble classifier are adjusted in two ways. First, a time-decayed strategy decreases the weights of the dynamic classifiers so that the ensemble classifier focuses more on the new concept of the data stream. Second, a novel reinforcement mechanism increases the weights of the base classifiers that perform better on the minority class and decreases the weights of the classifiers that perform worse. A resampling buffer stores instances of the minority class to balance the imbalanced distribution over time. In our experiments, we compare the proposed method with other state-of-the-art algorithms on both real-world and synthetic data streams. The results show that the proposed method achieves the best performance in terms of both prequential AUC and accuracy.

Index Terms: Online ensemble learning, resample learning, reinforcement, concept drift, class imbalance.
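The two weight adjustments named in this abstract can be sketched as follows. This is an illustrative reading only, not the RE-DI update rules: the multiplicative decay, the comparison against mean minority-class recall, and the names `decay_dynamic_weights`, `reinforce`, and `ensemble_score` are assumptions, because the abstract gives no formulas.

```python
def decay_dynamic_weights(weights, decay=0.9):
    """Time decay: shrink every dynamic classifier's weight by a constant factor."""
    return [w * decay for w in weights]


def reinforce(weights, minority_recalls, rate=0.1):
    """Raise weights of classifiers with above-average minority-class recall,
    lower those below average, and leave exact ties unchanged."""
    mean_recall = sum(minority_recalls) / len(minority_recalls)
    updated = []
    for w, recall in zip(weights, minority_recalls):
        if recall > mean_recall:
            updated.append(w * (1 + rate))
        elif recall < mean_recall:
            updated.append(w * (1 - rate))
        else:
            updated.append(w)
    return updated


def ensemble_score(weights, scores):
    """Weighted average of the base classifiers' positive-class scores."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


weights = decay_dynamic_weights([1.0, 1.0, 1.0])
weights = reinforce(weights, minority_recalls=[0.8, 0.5, 0.2])
print(weights, ensemble_score(weights, scores=[0.9, 0.6, 0.1]))
```

Applying the decay before the reinforcement step is an arbitrary ordering in this sketch; the abstract does not say which adjustment comes first.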

Determining Eight Biogenic Amines in Surface Water Using High-Performance Liquid Chromatography–Tandem Mass Spectrometry

Quan

Xie

Pan

et al. 2016

Pol. J. Environ. Stud.

Analysis of the Impact of Battlefield Environment on Military Operation Effectiveness Using Fuzzy Influence Diagram

Shan

Liu

2019

Int. J. Fuzzy Syst.

Online Active Learning Ensemble Framework for Drifted Data Streams
