<p><b>The client-centric multi-cloud has become a popular cloud ecosystem because it allows enterprise users to share workloads across multiple cloud providers, achieving high-quality services with lower operating costs and higher application resilience. From the perspective of application providers, the location of the cloud resources used for application deployment significantly impacts both the deployment cost and the performance of applications, e.g., request response time. This gives rise to the problem of location-aware application deployment in multi-cloud: selecting suitable cloud resources from widely distributed multi-cloud data centers to balance cost and performance. Existing research has not paid full attention to the key impact of location on application deployment. It is therefore urgent to study the problem both theoretically and in practice. In this thesis, innovative optimization methods and machine learning techniques are proposed for three common scenarios, namely composite application deployment, application replication and deployment, and elastic application deployment.</b></p>
<p>First, this thesis studies the composite application deployment problem with the goal of minimizing the average response time of composite applications subject to a budget constraint. We propose a Hybrid Genetic Algorithm (GA)-based approach, i.e., H-GA, for solving the problem, which has an extremely large search space. H-GA features a newly designed, domain-tailored service clustering algorithm, repair algorithm, solution representation, population initialization, and genetic operators. Experiments show that H-GA significantly outperforms several state-of-the-art approaches, achieving up to about 8% improvement in response time while maintaining 100% budget satisfaction.</p>
<p>Second, this thesis studies the application replication and deployment problem with the goal of minimizing the total deployment cost of all application replicas subject to a stringent requirement on average response time. We propose two approaches under different optimization frameworks to solve the problem. For the case where user requests are dispatched to the closest application replicas, we develop an approach under a GA framework for Application Replication and Deployment (ARD), i.e., GA-ARD. GA-ARD features a problem-specific solution representation, fitness measurement, and population initialization, which are effective in optimizing the deployment of application replicas in multi-cloud. The experiments show that GA-ARD outperforms application replication and placement strategies commonly used in industry. For the case where user requests are flexibly dispatched among different application replicas, we develop another approach under a two-stage optimization framework, i.e., MCApp. MCApp optimizes both replica deployment and request dispatching by combining mixed-integer linear programming with a domain-tailored large neighborhood search. Our experiments show that MCApp can achieve up to a 25% reduction in total deployment cost compared with several recently developed approaches.</p>
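<p>To make the closest-replica dispatching setting concrete, the following is a minimal sketch of how the average response time and total cost of a candidate replica placement could be evaluated. It is an illustration only, not the fitness measurement used by GA-ARD: the names <code>latency</code>, <code>requests</code>, <code>prices</code>, and all functions are hypothetical, and the response time is assumed to consist of network latency alone.</p>

```python
# Hypothetical illustration: evaluate a replica placement when every user
# region sends its requests to the closest (lowest-latency) deployed replica.
# Assumption: response time is approximated by network latency only.

def closest_replica_dispatch(latency, replicas):
    """Map each user region to the deployed data center with the lowest latency."""
    return {
        region: min(replicas, key=lambda dc: row[dc])
        for region, row in latency.items()
    }

def average_response_time(latency, requests, replicas):
    """Request-weighted mean latency under closest-replica dispatching."""
    assignment = closest_replica_dispatch(latency, replicas)
    total_requests = sum(requests.values())
    weighted = sum(requests[r] * latency[r][assignment[r]] for r in requests)
    return weighted / total_requests

def total_deployment_cost(prices, replicas):
    """Sum of the (assumed fixed) prices of the data centers hosting a replica."""
    return sum(prices[dc] for dc in replicas)

# Toy data: latency in milliseconds from user regions to data centers.
latency = {"eu": {"a": 10, "b": 50}, "us": {"a": 80, "b": 20}}
requests = {"eu": 100, "us": 300}
prices = {"a": 4.0, "b": 6.0}

one_replica = average_response_time(latency, requests, ["a"])
two_replicas = average_response_time(latency, requests, ["a", "b"])
```

<p>In this toy instance, adding a second replica lowers the average response time at a higher total cost, which is the trade-off the replication and deployment problem has to resolve.</p>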
<p>Third, this thesis studies the elastic application deployment problem, which aims to minimize the deployment cost over a time span such as a billing day while satisfying a constraint on average response time. The need to adapt the resources used for application deployment in response to dynamic and distributed workloads motivates us to adopt deep reinforcement learning (DRL) techniques. The proposed approach, namely DeepScale, applies a deep Q-network (DQN) to learn a scaling policy that can perform online resource scaling. DeepScale also includes a long short-term memory-based prediction model that allows the DQN to consider predicted future requests when making cost-effective scaling decisions. In addition, we design a penalty-based reward function and a safety-aware action executor to ensure that any scaling decision made by DRL satisfies the response time constraint. The experiments show that DeepScale can significantly reduce the deployment cost of applications compared with state-of-the-art baselines, including the Amazon auto-scaling service and recently proposed RL-based algorithms. Meanwhile, DeepScale effectively satisfies the constraint on average response time.</p>
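<p>As a rough illustration of the two constraint-handling ideas mentioned above, the sketch below shows one possible shape for a penalty-based reward and a safety-aware action executor. It is not DeepScale's actual design: the functions, the linear penalty, the <code>penalty_weight</code> parameter, and the <code>min_safe_vms</code> bound are all assumptions made for this example.</p>

```python
# Hypothetical sketch of constraint handling for an RL-based scaler.
# Neither function is taken from DeepScale; both are illustrative assumptions.

def penalty_reward(cost, avg_response_time, rt_limit, penalty_weight=10.0):
    """Reward = negative cost, minus a penalty proportional to how far the
    average response time exceeds the limit (zero penalty if within limit)."""
    violation = max(0.0, avg_response_time - rt_limit)
    return -cost - penalty_weight * violation

def safe_execute(action, current_vms, min_safe_vms):
    """Apply a scaling action (+k or -k VMs), but never scale below
    min_safe_vms, an assumed estimate of the smallest VM count that still
    keeps the response time within the limit."""
    return max(current_vms + action, min_safe_vms)

# A decision that meets the constraint is only charged its cost.
reward_ok = penalty_reward(cost=2.0, avg_response_time=0.125, rt_limit=0.25)
# A decision that violates the constraint receives an additional penalty.
reward_bad = penalty_reward(cost=2.0, avg_response_time=0.5, rt_limit=0.25)
# An aggressive scale-in request is clipped to the safe lower bound.
vms_after = safe_execute(action=-3, current_vms=5, min_safe_vms=4)
```

<p>The penalty discourages the agent from learning constraint-violating actions during training, while the executor acts as a last line of defense at decision time.</p>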
<p>In summary, this thesis studies three new problems for location-aware application deployment in multi-cloud. We propose four novel approaches under different optimization and machine learning frameworks, i.e., H-GA, GA-ARD, MCApp, and DeepScale, for solving these problems. New constraint handling techniques are developed to satisfy the practical deployment requirements of enterprise applications.</p>