Marcos Amarís scite author profile

Marcos Amarís

3 Publications

79 Citation Statements Received

70 Citation Statements Given


Affiliations

Federal University of Para, Universidade de São Paulo, Grenoble Alpes University

Publications


Generic Algorithms for Scheduling Applications on Hybrid Multi-core Machines

Amarís, Lucarelli, Mommessin, et al. 2017


We study the problem of executing an application represented by a precedence task graph on a multi-core machine composed of standard computing cores and accelerators. Contrary to most existing approaches, we distinguish the allocation and the scheduling phases, and we mainly focus on the allocation part of the problem: choosing the most appropriate type of computing unit for each task. We address both off-line and on-line settings. In the off-line case, we establish strong lower bounds on the worst-case performance of a known approach based on Linear Programming for solving the allocation problem. Then, we refine the scheduling phase by replacing the greedy list scheduling policy used in this approach with a better ordering of the tasks. Although this modification leads to the same approximability guarantees, it performs much better in practice. We also extend this algorithm to more types of heterogeneous cores, achieving an approximation ratio that depends on the number of different types. In the on-line case, we assume that the tasks arrive in an order that is not known in advance but respects the precedence relations, and that the scheduler has to make irrevocable decisions about their allocation and execution. In this setting, we propose the first scheduling algorithm with precedences, based on rules for selecting the type of processor to which each task is allocated. This algorithm achieves a constant-factor approximation guarantee if the ratio of the number of CPUs to the number of GPUs is bounded. Finally, all of these algorithms have been evaluated in a large number of simulations built upon actual libraries. These simulations show that the algorithms behave well in practice compared with state-of-the-art solutions, where these exist, or with baseline algorithms.
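The abstract separates allocation from scheduling and names greedy list scheduling as the baseline scheduling policy. The following is a minimal sketch of generic greedy list scheduling for a task graph whose CPU/GPU allocation has already been fixed. It is not the authors' algorithm: the function name `list_schedule`, the input format, and the priority order are all assumptions made for illustration, and the allocation is simply taken as given rather than computed by Linear Programming.

```python
import heapq

def list_schedule(tasks, preds, alloc, cpu_time, gpu_time, n_cpus, n_gpus):
    """Greedy list scheduling for a fixed CPU/GPU allocation (illustrative).

    tasks: task names in priority order (hypothetical input format)
    preds: dict mapping a task to the set of its predecessors (must be a DAG)
    alloc: dict mapping a task to "cpu" or "gpu" (output of a separate
           allocation phase, assumed to be given)
    Returns (makespan, start_times).
    """
    rank = {t: i for i, t in enumerate(tasks)}
    time_of = {t: cpu_time[t] if alloc[t] == "cpu" else gpu_time[t]
               for t in tasks}
    free = {"cpu": n_cpus, "gpu": n_gpus}
    remaining = {t: len(preds.get(t, ())) for t in tasks}
    succs = {t: [] for t in tasks}
    for t in tasks:
        for p in preds.get(t, ()):
            succs[p].append(t)

    ready = [t for t in tasks if remaining[t] == 0]
    running = []  # min-heap of (finish_time, task)
    start = {}
    now = 0.0
    while ready or running:
        # Start every ready task, in priority order, whose unit type is idle.
        ready.sort(key=rank.get)
        waiting = []
        for t in ready:
            kind = alloc[t]
            if free[kind] > 0:
                free[kind] -= 1
                start[t] = now
                heapq.heappush(running, (now + time_of[t], t))
            else:
                waiting.append(t)
        ready = waiting
        if not running:
            raise ValueError("ready tasks exist but no unit of their type")
        # Advance time to the next completion and release its successors.
        now, done = heapq.heappop(running)
        free[alloc[done]] += 1
        for s in succs[done]:
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)
    return now, start

# Toy instance: a and b run first, c needs both, d needs c.
tasks = ["a", "b", "c", "d"]
preds = {"c": {"a", "b"}, "d": {"c"}}
alloc = {"a": "cpu", "b": "gpu", "c": "gpu", "d": "cpu"}
cpu_time = {"a": 2, "b": 5, "c": 4, "d": 1}
gpu_time = {"a": 1, "b": 3, "c": 2, "d": 3}
makespan, start = list_schedule(tasks, preds, alloc, cpu_time, gpu_time, 1, 1)
```

The order of `tasks` plays the role of the list priority, so changing that order is where a different task ordering would plug in without touching the allocation.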


A comparison of GPU execution time prediction using machine learning and analytical modeling

Amarís, Camargo, Dyab, et al. 2016


A Simple BSP-based Model to Predict Execution Time in GPU Applications

Amarís, Cordeiro, Goldman, et al. 2015


Models are useful to represent abstractions of software and hardware processes. Bulk Synchronous Parallel (BSP) is a bridging model for parallel computation that allows algorithmic analysis of programs on parallel computers using performance modeling. The main idea of the BSP model is the treatment of communication and computation as abstractions of a parallel system. Meanwhile, GPU devices are becoming more widespread and are currently capable of performing efficient parallel computation for applications that can be decomposed into thousands of simple threads. However, few models for predicting application execution time on GPUs have been proposed. In this work we present a simple and intuitive BSP-based model for predicting CUDA application execution times on GPUs. The model is based on the number of computations and memory accesses of the GPU, with additional information on cache usage obtained from profiling. Scalability, divergence, the effect of optimizations, and differences between architectures are adjusted by a single parameter. We evaluated our model using two applications and six different boards. We showed that, using profile information from a single board, the model is general enough to predict the execution time of an application with different input sizes and on different boards with the same architecture. Our model predictions were within 0.8 to 1.2 times the measured execution times, which is reasonable for such a simple model. These results indicate that the model is good enough to generalize the predictions for different problem sizes and GPU configurations.
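The accuracy claim above is a ratio of predicted to measured execution time falling between 0.8 and 1.2. A minimal sketch of that check follows; the function names and the sample timings are hypothetical and are not taken from the paper, and the code does not reproduce the BSP-based model itself.

```python
def prediction_ratio(predicted, measured):
    """Ratio of predicted to measured execution time (same time unit)."""
    if measured <= 0:
        raise ValueError("measured time must be positive")
    return predicted / measured

def within_band(predicted, measured, low=0.8, high=1.2):
    """True if the prediction falls within [low, high] times the measurement."""
    return low <= prediction_ratio(predicted, measured) <= high

# Hypothetical timings in milliseconds, for illustration only.
ok = within_band(predicted=9.0, measured=10.0)    # ratio 0.9
bad = within_band(predicted=13.0, measured=10.0)  # ratio 1.3
```

A ratio below 1 means the model underestimates the measured time, and a ratio above 1 means it overestimates it.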


