In-memory cluster computing platforms have gained momentum in recent years due to their ability to analyse large amounts of data in parallel. These platforms are complex and difficult to manage. In addition, there is a lack of tools to understand and optimize such platforms, even though they form the backbone of big data infrastructure and technologies. This leads directly to underutilization of available resources and to application failures in such environments. One key way to address this problem is to optimize the task parallelism of the applications that run in them. In this paper, we propose a machine learning based method that recommends optimal parameters for task parallelization in big data workloads. By monitoring and gathering metrics at the system and application level, we are able to find statistical correlations that allow us to characterize and predict the effect of different parallelism settings on performance. These predictions are used to recommend an optimal configuration to users before they launch their workloads in the cluster, avoiding possible failures, performance degradation and wasted resources. We evaluate our method with a benchmark of 15 Spark applications on the Grid5000 testbed. We observe up to a 51% gain in performance when using the recommended parallelism settings. The model is also interpretable and can give the user insight into how different metrics and parameters affect performance.
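The recommendation step described in this abstract can be illustrated with a minimal sketch: given some model that predicts runtime for a parallelism setting, choose the candidate with the lowest prediction. This is not the authors' method. The names `recommend_parallelism` and `toy_model` are hypothetical, and the toy cost curve is an assumption that stands in for a model trained on system and application metrics.

```python
def recommend_parallelism(predict_runtime, candidates):
    """Return the candidate setting with the lowest predicted runtime."""
    return min(candidates, key=predict_runtime)


def toy_model(partitions):
    """Hypothetical stand-in for a trained predictor (seconds)."""
    work = 1200.0 / partitions    # per-task work shrinks as tasks increase
    overhead = 0.05 * partitions  # scheduling overhead grows with task count
    return work + overhead


candidates = [8, 16, 32, 64, 128, 256, 512]
best = recommend_parallelism(toy_model, candidates)
print(best)
```

Under these assumed costs, too few partitions leave work serialized and too many are dominated by overhead, so the recommended value falls in between.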
The growth in the number of cloud computing users has led to the availability of a variety of cloud based services provided by different vendors. This has made the task of selecting a suitable set of services quite difficult. There has been a lot of research towards the development of suitable decision support systems (DSS) to assist users in making an optimal selection of cloud services. However, existing decision support systems cannot address two crucial issues: firstly, the simultaneous involvement of both business and technical perspectives in decision making and, secondly, the selection of services across multiple clouds using a single DSS. In this paper, we tackle these issues in the context of cloud service discovery. In particular, we present the following novel contributions. Firstly, we present a critical analysis of the state of the art in decision support systems. Based on our analysis, we identify critical shortcomings in the existing tools and develop the set of requirements that a potential DSS should meet. Secondly, we present a new holistic framework for the development of DSS which allows a pragmatic description of user requirements. Additionally, data gathering and analysis are studied as an integral part of the proposed DSS, and we therefore present concrete algorithms to assess the data for optimal service discovery. Thirdly, we assess the applicability of our framework to cloud service selection using an industrial case study. We also demonstrate the implementation and performance of our proposed framework using a prototype which serves as a proof of concept. Overall, this paper provides a novel and holistic framework for the development of a decision support system based on multiple cloud service discovery. 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
Industries in all sectors are experiencing a profound digital transformation that puts software at the core of their businesses. To react to continuously changing user requirements and dynamic markets, companies need to build robust workflows that increase their agility and keep them competitive. This increasingly rapid transformation, especially in domains like IoT or Cloud computing, poses significant challenges to guaranteeing high quality software, since dynamism and agile short-term planning reduce the ability to detect and manage risks. In this paper, we describe the main challenges related to managing risk in agile software development, building on the experience of more than 20 agile coaches operating continuously for 15 years with hundreds of teams in industries in all sectors. We also propose a framework to manage risks that considers those challenges and supports collaboration, agility, and continuous development. An implementation of that framework is then described in a tool that handles risks and mitigation actions associated with the development of multi-cloud applications. The methodology and the tool have been validated by a team of evaluators who were asked to consider their use in developing an urban smart mobility service and an airline flight scheduling system.
This paper presents NANCY, a system that generates adaptive bit rates (ABR) for video and adaptive network coding rates (ANCR) using reinforcement learning (RL) for video distribution over wireless networks. NANCY trains a neural network model with rewards formulated as quality of experience (QoE) metrics. It performs joint optimization in order to select: (i) adaptive bit rates for future video chunks to counter variations in available bandwidth and (ii) adaptive network coding rates to encode the video chunk slices to counter packet losses in wireless networks. We present the design and implementation of NANCY, and evaluate its performance compared to state-of-the-art video rate adaptation algorithms including Pensieve and robustMPC. Our results show that NANCY provides 29.91% and 60.34% higher average QoE than Pensieve and robustMPC, respectively.
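The abstract does not give NANCY's reward formula. As a rough illustration only, the sketch below uses a linear QoE form that is common in the video rate adaptation literature: total bitrate, minus a rebuffering penalty, minus a penalty for bitrate switches. The function name `linear_qoe` and the penalty weight are assumptions, not values taken from NANCY.

```python
def linear_qoe(bitrates_mbps, rebuffer_seconds, rebuffer_penalty=4.0):
    """Illustrative session QoE; the penalty weight is an assumed value.

    bitrates_mbps: bitrate chosen for each video chunk.
    rebuffer_seconds: stall time incurred while fetching each chunk.
    """
    quality = sum(bitrates_mbps)
    stall_cost = rebuffer_penalty * sum(rebuffer_seconds)
    # Smoothness term: penalize changes in bitrate between adjacent chunks.
    switch_cost = sum(
        abs(nxt - cur) for cur, nxt in zip(bitrates_mbps, bitrates_mbps[1:])
    )
    return quality - stall_cost - switch_cost


print(linear_qoe([1.0, 2.0, 2.0], [0.0, 0.5, 0.0]))
```

A reinforcement learning agent trained on a reward of this shape is pushed toward high bitrates while avoiding stalls and abrupt quality changes.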