Computing resources are now ubiquitous and computational research techniques permeate all disciplines. However, exploiting available resources can be a much more complicated proposition. There is no guarantee that one can simply use a compute resource with no more effort than copying binaries and data. Because computing resources are usually heterogeneous in both hardware and software configurations, many requirements must be matched to execute a computation on a new resource in a new environment. The difficulties increase when dealing with parallel computations, which add a layer of dependencies related to the Message Passing Interface (MPI) standard libraries. Unfortunately, existing techniques for managing the migration process are either inadequate or require a non-trivial amount of effort and experience. In particular, schedulers are not generally designed to capture a computation's software-related requirements and thus depend on users to configure such dependencies. Additionally, the set of possible sites where computations could be scheduled is limited to those where the computations are known to be able to run, a determination that in the current state of the art is performed manually by the user. This process, which requires enumerating dependencies, checking and making them available in new environments, and potentially recompiling the computation, can take many hours of labor. The difficulty is compounded by the fact that many researchers in disciplines that are not traditionally compute-heavy may not have experience with configuring a single environment, let alone with migrating a computation from one environment to another. An ideal solution for providing deployment and, therefore, scheduling freedom would allow any computation to quickly and easily be run on computing resources with tuned performance. Before addressing the difficult but secondary issues of automatic recompilation and tuning, the first goal must be met: enabling computations to run on new resources at all.

I would like to thank everyone who was part of the process. In particular:

To my advisor Andrew Grimshaw: thank you for all your invaluable teaching and guidance. From teaching me parallel programming and grid computing to guiding me toward becoming a researcher and professional, it has been a pleasure working with you.

To my husband Dan: thank you for your patience, encouragement, understanding, and love.

To my parents, Beata Sarnowska and Krzysztof Sarnowski, and grandparents, Barbara and Adam Habdas: thank you for your unconditional love, guidance, and support, and for always encouraging me to push on.

To John Knight, Paul Reynolds, Malathi Veeraraghavan, and Mitch Rosen: thank you for your mentoring.

To the Grimshaw research group and UVACSE team: thank you for your constructive criticisms and support.

To all the many friends I made in Charlottesville: thank you for helping to make my time in graduate school so enjoyable. A special thanks to Tom Tracy and Chih-hao Shen for always being willing to lend a helping hand.

I greatly appreciate everyone's support. Thank you all again!
As computing resources have become ubiquitous, computational research initiatives have spread into a wider variety of disciplines. With the variety of computing environments dramatically expanded, using available compute resources can be a much more complicated proposition. Additionally, users in disciplines that are not traditionally compute-heavy may not have experience with migrating an application from one computing environment to another. Thus, while more and faster resources should allow for more and better research to be carried out, the increase in resources can just as easily stymie progress. An ideal solution would enable computations to run on any available compute resource with minimal interaction from the user and would run a version of the application tuned for that particular site. In this work, we focus on the first goal. This step alone dramatically improves the ability of researchers to take advantage of the variety of computing resources available to them and, as a result, carry out more and better research.

The work presented in this paper specifically focuses on increasing the ease of use of high-performance computing (HPC) clusters for running parallel computations coded using the MPI standard. We present methods that determine whether an HPC site is a good fit for running an MPI binary. We present a Linux-based implementation of our methods called FEAM (a Framework for Efficient Application Migration). FEAM predicts execution readiness, resolves missing shared libraries, and composes site-specific configurations. We show that FEAM is more than 90% accurate at predicting execution readiness of MPI application binaries from the NAS Parallel and SPEC MPI2007 benchmark suites. We also show that by automatically resolving shared-library requirements, FEAM is able to increase the number of successful executions by 41%.
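To make the shared-library portion of such a check concrete, the sketch below is a simplified illustration, not FEAM's actual implementation; the helper name and the use of ldd are assumptions. It enumerates the dynamic dependencies of an MPI binary on a Linux site and reports any that cannot be resolved, which is one ingredient of an execution-readiness decision.

    #!/usr/bin/env python3
    # Minimal sketch (not FEAM itself): list an MPI binary's shared-library
    # dependencies with ldd and report any that do not resolve on this site.
    import subprocess
    import sys

    def missing_shared_libraries(binary_path):
        """Return the shared libraries that ldd reports as unresolved."""
        result = subprocess.run(["ldd", binary_path],
                                capture_output=True, text=True)
        missing = []
        for line in result.stdout.splitlines():
            # ldd prints "libfoo.so.1 => not found" for unresolved dependencies.
            if "not found" in line:
                missing.append(line.split("=>")[0].strip())
        return missing

    if __name__ == "__main__":
        binary = sys.argv[1]  # e.g. a NAS Parallel Benchmarks executable
        unresolved = missing_shared_libraries(binary)
        if unresolved:
            print("Not execution-ready here; missing libraries:")
            for lib in unresolved:
                print("  " + lib)
        else:
            print("All shared-library dependencies resolve on this site.")

A check like this only reports what is missing; as described above, FEAM goes further by resolving the missing libraries and composing site-specific configurations rather than merely flagging them.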