There are many ways to build a predictive model from data. Besides the numerous classification and regression algorithms to choose from, there are countless useful data transformations that can be applied prior to modeling. To assist in discovering good predictive analytics workflows, we recently introduced a collaborative analytics system that allows workflow sharing and reuse. We designed a recommendation engine for the system to match analytics needs with relevant workflows stored in a repository. The engine relies on meta-predictive modeling of traffic-analysis workflow characteristics. In this paper, we present a feasibility study of applying this collaborative analytics system to predict traffic congestion. Different ways to build predictive models from a traffic dataset are pooled as shared workflows. We demonstrate that reliable congestion prediction can be achieved through dynamic recommendation of workflows suited to the real-time, varying traffic data. The promising results show that systematic collaboration among data scientists, made possible by our system, can be a powerful tool for producing very accurate predictions from data.
This paper reports on the development of the Cloud Oriented Data Analytics (CODA) framework, which provides functions for composing, managing, and processing data analytics workflows in cloud computing. The framework offers users a number of reusable software components for data analytics, which can be composed into workflows through well-known workflow composers, e.g., RapidMiner, Taverna, and JOpera. In particular, the framework addresses workflow scheduling, workflow recommendation, resource provisioning, resource monitoring, data locality, and security for the workflow computation. Using the framework, we demonstrate that workflows can be easily composed and processed in cloud computing. By coordinating the submitted workflows, we can obtain a significant improvement in performance.
This paper proposes a multi-tenant workflow framework that allows users to create data analytic workflows whose tasks are efficiently scheduled and distributed in a cloud computing environment. We provide a demo of an event room assignment (ERA) application as a test of the framework. The ERA dynamically and automatically assigns registered events (e.g., meetings, classes, conferences) to available rooms that meet user requirements such as event size, purpose, and reservation period. The assignment improves energy efficiency with respect to power usage (e.g., lighting, ventilation, devices), and the energy savings can be achieved without affecting people's comfort. We run the ERA with power consumption data (approximately 50 GB) collected from each of over 200 rooms in a building at the Dept. of Engineering, Tokyo University. Through the demonstration, we will show that the proposed framework accelerates data analysis by providing user-friendly workflow composition and parallel processing features that utilize cloud computing technologies.
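To make the room-assignment idea concrete, the following is a minimal sketch, not the method used in the paper. It assumes a simple greedy heuristic: each event goes to the smallest free room that fits it, on the assumption that a smaller room draws less power. The `Room` and `Event` types, the `assign_rooms` function, and the sample data are all hypothetical, and only event size and reservation period are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Room:
    name: str
    capacity: int


@dataclass(frozen=True)
class Event:
    name: str
    size: int
    start: int  # start hour, inclusive
    end: int    # end hour, exclusive


def assign_rooms(events, rooms):
    """Greedy sketch: give each event the smallest free room that fits it.

    Assumption (not from the paper): smaller rooms use less power, so the
    smallest adequate room is preferred. Unplaceable events map to None.
    """
    bookings = {room.name: [] for room in rooms}
    assignment = {}
    rooms_by_capacity = sorted(rooms, key=lambda r: r.capacity)

    for event in sorted(events, key=lambda e: (e.start, e.end)):
        assignment[event.name] = None
        for room in rooms_by_capacity:
            if room.capacity < event.size:
                continue  # room is too small for this event
            # Two half-open intervals overlap if each starts before the other ends.
            overlaps = any(
                event.start < booked_end and booked_start < event.end
                for booked_start, booked_end in bookings[room.name]
            )
            if not overlaps:
                bookings[room.name].append((event.start, event.end))
                assignment[event.name] = room.name
                break
    return assignment


# Hypothetical sample data for illustration only.
rooms = [Room("A", 10), Room("B", 40)]
events = [
    Event("seminar", 30, 9, 11),
    Event("meeting", 8, 9, 10),
    Event("class", 35, 10, 12),
]
result = assign_rooms(events, rooms)
print(result)
```

In this sample, the class cannot be placed because the only room large enough is still occupied by the seminar. A real system would also need to weigh event purpose and measured power consumption per room, which this sketch leaves out.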