Overloaded data stream management systems (DSMS) cannot process all tuples within the required response time. For some DSMS, it is crucial to allocate scarce resources to the most significant tuples. Prior work has applied shedding and spilling, which permanently drop insignificant tuples or temporarily move them to disk, respectively. However, neither approach considers that tuple significance can be multi-tiered or that significance determination can be costly; both treat all tuples that are not dropped as equally significant. Unlike these prior works, we take a fresh stance by pulling the most significant tuples forward throughout the query pipeline. Proactive Promotion (PP), a new DSMS methodology for preferential CPU resource allocation, selectively pulls the most significant tuples ahead of less significant ones. Our optimizer produces an optimal PP plan that minimizes the processing latency of tuples in the most significant tiers of this multi-tiered precedence scheme by strategically placing significance determination operators throughout the query pipeline at compile time and by agilely activating them at run time. Our results substantiate that PP lowers latency and increases throughput for significant results compared to state-of-the-art shedding and traditional DSMS approaches (by a factor of 2 to 18 across a rich diversity of datasets), with negligible overhead.
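To make the idea of pulling significant tuples ahead of less significant ones concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each tuple has already been assigned an integer significance tier (with tier 0 as the most significant) and that an operator's input queue can be modeled as a priority queue. The names `TieredQueue`, `drain`, and `demo` are hypothetical, and the sketch does not cover the optimizer or the placement and activation of significance determination operators.

```python
import heapq
import itertools
from typing import Any, Iterator, List, Tuple


class TieredQueue:
    """Hypothetical operator input queue: most significant tier is released first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        # Arrival counter keeps FIFO order among tuples of the same tier.
        self._arrival = itertools.count()

    def push(self, item: Any, tier: int) -> None:
        # Assumption: tier 0 is the most significant; larger values are less so.
        heapq.heappush(self._heap, (tier, next(self._arrival), item))

    def pop(self) -> Any:
        _tier, _seq, item = heapq.heappop(self._heap)
        return item

    def __len__(self) -> int:
        return len(self._heap)


def drain(queue: TieredQueue) -> Iterator[Any]:
    """Yield queued tuples in the order a downstream operator would see them."""
    while len(queue):
        yield queue.pop()


def demo() -> List[str]:
    q = TieredQueue()
    # Tuples arrive in the order a, b, c, d with tiers 2, 0, 1, 0.
    for name, tier in [("a", 2), ("b", 0), ("c", 1), ("d", 0)]:
        q.push(name, tier)
    return list(drain(q))
```

In this sketch, `demo()` returns `["b", "d", "c", "a"]`: both tier-0 tuples overtake the earlier-arriving tuple `a`, while arrival order is preserved within a tier. Unlike shedding, no tuple is discarded; less significant tuples are only delayed.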