As online social networks play an increasingly important role in shaping public opinion, large-scale simulation of social networks has attracted researchers from sociology, communication, informatics, and other fields. Agent-based modeling and simulation (ABMS), which scholars in computational sociology consider an effective approach, makes it possible to study real information diffusion in a symmetrical simulated world. However, classical ABMS tools such as NetLogo cannot support simulations of more than a few thousand agents, while big data platforms such as Hadoop and Spark, which are used to study large datasets, provide no optimization for large-scale social network simulation. This paper proposes a two-tier partition algorithm for optimizing large-scale simulation of social networks. First, the ABMS simulation kernel for information diffusion is implemented on the Spark platform. Both the data structure and the scheduling mechanism are built on Resilient Distributed Datasets (RDDs) to simulate millions of agents. Second, the two-tier partition algorithm is implemented with community detection and graph cut: community detection finds partitions with high internal interaction in the social network, and graph cut is used to achieve load balance. Finally, using a dataset recorded from Twitter, a series of experiments evaluates the performance of the two-tier partition algorithm in terms of both communication cost and load balance.
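The two goals named in this abstract, low communication cost and balanced load, can be illustrated with a minimal sketch. It is not the paper's algorithm: the communities are assumed to come from any community detector, a greedy largest-first packing stands in for the graph-cut step, and the names `pack_communities` and `cut_edges` are hypothetical.

```python
def pack_communities(communities, num_partitions):
    """Place whole communities on partitions, largest first, always
    choosing the currently lightest partition (greedy balance heuristic,
    an assumed stand-in for the graph-cut step)."""
    loads = [0] * num_partitions
    assignment = {}  # agent id -> partition index
    for members in sorted(communities, key=len, reverse=True):
        target = loads.index(min(loads))
        for agent in members:
            assignment[agent] = target
        loads[target] += len(members)
    return assignment, loads


def cut_edges(edges, assignment):
    """Count edges whose endpoints sit on different partitions, used here
    as a proxy for cross-partition communication cost."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Hypothetical toy network: four communities, already detected.
communities = [[0, 1, 2, 3], [4, 5, 6], [7, 8], [9]]
edges = [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (7, 8),  # intra-community
         (3, 4), (6, 7), (8, 9)]                          # inter-community

assignment, loads = pack_communities(communities, num_partitions=2)
print("agents per partition:", loads)
print("cut edges:", cut_edges(edges, assignment))
```

Keeping each community on one partition means that only inter-community edges can cross partitions, while the packing step evens out the number of agents per partition.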
Cloud computing has emerged as a powerful and promising way to run high performance computing (HPC) jobs. Most HPC jobs are designed under a multi-process paradigm and involve frequent communication and synchronization among parallel processes. However, because the underlying resources of cloud data centers are always shared among multiple tenants, competition among jobs for limited bandwidth leads to unpredictable job completion times, which may cause QoS violations and inefficient resource utilization when parallel jobs are scheduled in the cloud. To tackle this issue, it is essential to provide bandwidth guarantees for parallel jobs running in the cloud. Offering a dedicated virtual cluster (VC) for running applications is a popular way to guarantee bandwidth demands. Motivated by these problems, in this paper we first design a time-aware virtual cluster (TVC) request model for parallel jobs and consider how to embed the requested TVCs into the cloud efficiently under a parallel job scheduling framework. An adaptive bandwidth-aware heuristic algorithm, denoted AdaBa, is proposed to improve the job accept rate by adaptively adjusting the priorities of servers that accommodate the VMs of a TVC according to the relative size of the requested bandwidth demand. Then, a bandwidth-guaranteed migration and backfilling scheduling algorithm, denoted BgMBF, is designed to schedule parallel jobs, with bandwidth demands guaranteed by AdaBa. To obtain high job responsiveness, a bandwidth-reserved job backfilling strategy is applied when the TVC requested by the currently scheduled job cannot be allocated in the cloud. The migration cost of BgMBF is also considered, and an enhanced version, BgMBFSDF, is proposed to minimize the number of migrations when the execution times of jobs are known.
Through extensive simulation experiments on popular parallel workloads, our proposed TVC embedding algorithm AdaBa improves the accept rate by up to 15 percent compared with existing algorithms such as Oktopus and a greedy algorithm. Our proposed BgMBF and BgMBFSDF also significantly outperform other popular scheduling algorithms integrated with AdaBa in average response time and average bounded slowdown.
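For readers unfamiliar with the two metrics, the sketch below computes them using the definitions common in the parallel job scheduling literature. The abstract does not state its exact formulas, so the 10-second threshold and the sample jobs are assumptions.

```python
def response_time(wait, run):
    """Time from submission to completion."""
    return wait + run


def bounded_slowdown(wait, run, threshold=10.0):
    """Response time relative to run time, with the run time floored at
    `threshold` seconds so that very short jobs do not dominate the
    average. The threshold value is an assumption."""
    return max((wait + run) / max(run, threshold), 1.0)


# Hypothetical jobs as (wait seconds, run seconds).
jobs = [(30.0, 60.0), (5.0, 1.0), (0.0, 120.0)]

avg_response = sum(response_time(w, r) for w, r in jobs) / len(jobs)
avg_bsld = sum(bounded_slowdown(w, r) for w, r in jobs) / len(jobs)
print(f"average response time: {avg_response:.1f} s")
print(f"average bounded slowdown: {avg_bsld:.3f}")
```

The second job shows why the bound matters: a 1-second job that waits 5 seconds has a raw slowdown of 6, but its bounded slowdown is 1.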
Cloud computing is attracting an increasing number of simulation applications that run in virtualized cloud data centers. These applications are submitted to the cloud in the form of simulation jobs, so the management and scheduling of simulation jobs play an essential role in offering efficient, high-productivity computational services. In this paper, we design a management and scheduling service framework for simulation jobs in a two-tier virtualization-based private cloud data center, named simulation execution as a service (SimEaaS). It aims to free users from complex simulation run settings while adaptively guaranteeing their QoS requirements. Furthermore, a novel job scheduling algorithm named adaptive deadline-aware job size adjustment (ADaSA) is designed to achieve high job responsiveness under QoS requirements for SimEaaS. ADaSA tries to make full use of idle resource fragments by adaptively tuning the number of processes requested by queued jobs, while guaranteeing that the jobs' deadline requirements are not violated. Extensive trace-driven simulation experiments are conducted to evaluate the performance of ADaSA. The results show that ADaSA outperforms both the cloud-based job scheduling algorithm KCEASY and traditional EASY in terms of response time (up to 90%) and bounded slowdown (up to 95%), while achieving an approximately equivalent deadline-missed rate. ADaSA also outperforms two representative moldable scheduling algorithms in terms of deadline-missed rate (up to 60%).
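The idea of resizing a queued job to fit idle resources without missing its deadline can be sketched as follows. This is an illustration only, not the published ADaSA algorithm: it assumes perfectly linear speedup (total work stays constant as the process count changes), and the function name `fit_job_to_idle` and its parameters are hypothetical.

```python
import math


def fit_job_to_idle(req_procs, req_runtime, deadline, now, idle_procs):
    """Return (procs, runtime) for running the job on the idle processors
    right now, or None if no size meets the deadline.

    Assumption: linear speedup, so work = req_procs * req_runtime is
    constant and runtime scales as work / procs.
    """
    work = req_procs * req_runtime   # processor-seconds
    slack = deadline - now           # time left until the deadline
    if slack <= 0 or idle_procs <= 0:
        return None
    min_procs = math.ceil(work / slack)   # fewest processes that still finish in time
    procs = min(req_procs, idle_procs)    # never exceed the original request
    if procs < min_procs:
        return None
    return procs, work / procs


# A job asking for 16 processes for 100 s, due at t = 400 s, examined at t = 0.
print(fit_job_to_idle(16, 100.0, 400.0, 0.0, idle_procs=6))   # shrunk to fit the fragment
print(fit_job_to_idle(16, 100.0, 400.0, 0.0, idle_procs=3))   # too few processes for the deadline
print(fit_job_to_idle(16, 100.0, 400.0, 0.0, idle_procs=32))  # runs at the requested size
```

Under these assumptions, a fragment of 6 idle processors is enough to start the job early, while 3 would push completion past the deadline, so the job stays in the queue.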