Streaming data technologies play a vital role in real-time applications. When analyzing such data, random sampling with replacement makes it difficult to draw inferences from a small random sample, while sampling without replacement is poorly suited to sub-streams that correspond to different sources. To mine data streams from heterogeneous sources effectively, this work proposes Adaptive Reservoir sampling Of stream In a Time window (AdROIT), which partitions the streams in a window by time and determines the size of the historical data in the reference window according to the data changes in the observation window. Measuring the standard deviation of the partitioned window identifies whether changes in the statistical properties of a data stream are due to one source or to multiple sources. AdROIT allocates a reservoir sampling size to each source, ensures adaptability, updates the ensemble classifier with dynamically estimated weights, and judges the accuracy of each member by its weight. Experimental results show that AdROIT provides better classification and mining results over heterogeneous data streams. Under a high degree of heterogeneity, AdROIT increases precision by 16% and recall by 30% compared with Chain sampling. Under the same conditions, AdROIT uses 40 kb of storage, more than Chain sampling. Finally, with a large window size, AdROIT reduces execution time by 15 seconds and improves recall by 40% compared with Chain sampling.
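The abstract does not give the allocation rule, so the following is only a minimal sketch of the general idea of giving each source in a time window its own reservoir. It uses classic reservoir sampling (Algorithm R) and assumes a proportional split of the sampling budget across sources. The names `reservoir_sample` and `sample_window_by_source`, the `(source_id, value)` record layout, and the proportional rule are all illustrative assumptions, not the authors' method.

```python
import random
from collections import defaultdict


def reservoir_sample(stream, k, rng):
    """Algorithm R: uniform sample of up to k items from an iterable."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


def sample_window_by_source(window, total_size, seed=0):
    """Sample one time window, one reservoir per source.

    `window` is a list of (source_id, value) pairs (assumed layout).
    The budget `total_size` is split in proportion to each source's
    share of the window; this rule is an assumption for illustration.
    """
    rng = random.Random(seed)
    by_source = defaultdict(list)
    for source, value in window:
        by_source[source].append(value)

    samples = {}
    for source, values in by_source.items():
        share = len(values) / len(window)
        k = max(1, round(total_size * share))  # at least one slot per source
        samples[source] = reservoir_sample(values, k, rng)
    return samples


if __name__ == "__main__":
    window = [("a", i) for i in range(80)] + [("b", i) for i in range(20)]
    for source, sample in sample_window_by_source(window, 10).items():
        print(source, sample)
```

Sampling each source separately keeps a low-volume source from being crowded out of the sample by a high-volume one, which is the concern the abstract raises about sub-streams from different sources.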