Learning from changing data streams is an important task in data mining. Concept drift is the phenomenon in which the underlying distribution of a data stream changes over time. In classification, concept drift can sharply degrade the performance of the original classifier, because the old decision model no longer suits the new data environment. Handling concept drift is therefore crucial to maintaining classifier performance. Most current drift detection methods apply the same detection strategy to every data stream and pay little attention to the characteristics of the individual stream, which limits how well drift detectors adapt to different environments. To address this issue, we first propose a variance estimation strategy and a variance feedback strategy that characterize a data stream through its variance. Based on this variance, we develop personalized drift detection schemes for different data streams, thereby improving the adaptability of drift detection across environments. We conducted experiments on data streams with various types of drift. The results show that our algorithm achieves the best average ranking for accuracy on the synthetic dataset, with an overall ranking 1.12 to 1.5 higher than the next-best algorithm. Compared with algorithms that use the same statistical tests, our method improves the ranking by 3 to 3.5 for the Hoeffding test and by 1.12 to 2.25 for the McDiarmid test. It also achieves a good balance between detection delay and false positive rate. Finally, our algorithm ranks higher than existing drift detection methods across the four key metrics of accuracy, CPU time, false positives, and detection delay.
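For readers unfamiliar with Hoeffding-based drift tests, the following is a minimal sketch of the general idea and not the algorithm proposed in this paper: it compares the mean error of a recent window against a reference window and signals drift when the increase exceeds a Hoeffding bound. The class name `WindowedHoeffdingDetector`, the fixed window size, and the confidence parameter `delta` are all illustrative assumptions; the variance estimation and variance feedback strategies described above are not reproduced here.

```python
import math
from collections import deque


def hoeffding_epsilon(n_ref, n_recent, delta):
    """One-sided Hoeffding bound on the difference between two sample
    means of independent values in [0, 1], at confidence 1 - delta."""
    inv_m = 1.0 / n_ref + 1.0 / n_recent
    return math.sqrt(inv_m * math.log(1.0 / delta) / 2.0)


class WindowedHoeffdingDetector:
    """Hypothetical fixed-window detector (illustration only)."""

    def __init__(self, window=100, delta=0.001):
        self.window = window
        self.delta = delta
        self.reference = deque(maxlen=window)  # errors under the old concept
        self.recent = deque(maxlen=window)     # most recent errors

    def update(self, error):
        """Feed one prediction error (0 = correct, 1 = wrong).
        Returns True when drift is signalled."""
        # Fill the reference window first.
        if len(self.reference) < self.window:
            self.reference.append(error)
            return False
        # Then slide the recent window over the stream.
        self.recent.append(error)
        if len(self.recent) < self.window:
            return False
        ref_mean = sum(self.reference) / len(self.reference)
        rec_mean = sum(self.recent) / len(self.recent)
        eps = hoeffding_epsilon(len(self.reference), len(self.recent), self.delta)
        if rec_mean - ref_mean > eps:
            # Drift: adopt the recent window as the new reference.
            self.reference = deque(self.recent, maxlen=self.window)
            self.recent.clear()
            return True
        return False


if __name__ == "__main__":
    detector = WindowedHoeffdingDetector(window=100, delta=0.001)
    # Synthetic error stream: no errors, then an abrupt change to all errors.
    stream = [0] * 200 + [1] * 100
    for i, err in enumerate(stream):
        if detector.update(err):
            print("drift signalled at index", i)
```

A fixed `delta` and window size apply the same strategy to every stream, which is the limitation the abstract points to; a variance-aware scheme would presumably adjust the threshold per stream, but how the paper does so is not stated here.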