In many decision-making applications, users typically issue aggregate queries. To evaluate these computationally expensive queries, online aggregation has been developed to provide approximate answers (with their respective confidence intervals) quickly, and to continuously refine the answers. In this paper, we extend the online aggregation technique to a distributed context where sites are maintained in a DHT (Distributed Hash Table) network. Our Distributed Online Aggregation (DoA) scheme iteratively and progressively produces approximate aggregate answers as follows: in each iteration, a small set of random samples is retrieved from the data sites and distributed to the processing sites; at each processing site, a local aggregate is computed based on the allocated samples; at a coordinator site, these local aggregates are combined into a global aggregate. DoA adaptively grows the number of processing nodes as the sample size increases. To further reduce the sampling overhead, the samples are retained as a precomputed synopsis over the network to be used for processing future queries. We also study how these synopses can be maintained incrementally. We have conducted extensive experiments on PlanetLab. The results show that our DoA scheme significantly reduces the initial waiting time and progressively provides high-quality approximate answers with running confidence intervals.
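The iteration described above (sample, aggregate locally, combine at a coordinator, report a running confidence interval) can be illustrated with a minimal single-process sketch. This is not the DoA implementation: the function names, the AVG query, the fixed number of processing sites, the with-replacement sampling, and the CLT-based 95% interval are all assumptions made for illustration, and the DHT network, the adaptive growth of processing nodes, and synopsis reuse are not modeled.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class LocalAggregate:
    """Sufficient statistics for AVG: count, sum, and sum of squares."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0


def local_aggregate(samples):
    """Processing-site step: summarize the samples allocated to one site."""
    return LocalAggregate(len(samples), sum(samples), sum(x * x for x in samples))


def combine(parts):
    """Coordinator step: merge local aggregates into one global aggregate."""
    return LocalAggregate(
        sum(p.n for p in parts),
        sum(p.total for p in parts),
        sum(p.total_sq for p in parts),
    )


def estimate_mean(agg, z=1.96):
    """Return (mean, half_width) of an approximate CLT-based interval."""
    mean = agg.total / agg.n
    if agg.n < 2:
        return mean, float("inf")
    # Sample variance from the sufficient statistics, clamped at zero
    # to guard against floating-point round-off.
    var = max((agg.total_sq - agg.n * mean * mean) / (agg.n - 1), 0.0)
    return mean, z * math.sqrt(var / agg.n)


def online_avg(data, batch_size=100, iterations=5, sites=4, seed=0):
    """Refine an AVG estimate over several sampling iterations."""
    rng = random.Random(seed)
    running = LocalAggregate()
    results = []
    for _ in range(iterations):
        # Assumption: uniform sampling with replacement from one local list
        # stands in for retrieving random samples from remote data sites.
        batch = [rng.choice(data) for _ in range(batch_size)]
        # Deal the batch round-robin to a fixed number of processing sites.
        parts = [local_aggregate(batch[i::sites]) for i in range(sites)]
        running = combine([running] + parts)
        results.append(estimate_mean(running))
    return results


if __name__ == "__main__":
    values = [float(i) for i in range(1000)]
    for step, (mean, half_width) in enumerate(online_avg(values), start=1):
        print(f"iteration {step}: {mean:.1f} +/- {half_width:.1f}")
```

Because each site reports only a count, a sum, and a sum of squares, the coordinator can merge results without seeing the raw samples, and the interval narrows as more samples accumulate across iterations.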
We present the design, implementation, and evaluation of AirCloud, a novel client-cloud system for pervasive and personal air-quality monitoring at low cost. At the frontend, we create two types of Internet-connected particulate matter (PM2.5) monitors, AQM and miniAQM, with carefully designed mechanical structures for optimal airflow. On the cloud side, we create an air-quality analytics engine that learns and creates models of air quality based on a fusion of sensor data. This engine is used to calibrate AQMs and miniAQMs in real time and to infer PM2.5 concentrations. We evaluate AirCloud using 5 months of data and 2 months of continuous deployment, and show that AirCloud achieves good accuracy at much lower cost than previous solutions. We also show three real applications built on top of AirCloud by third-party developers to further demonstrate the value of our system.