Antoni Navarro scite author profile

Antoni Navarro

5Publications

11Citation Statements Received

46Citation Statements Given

How they've been cited

How they cite others

100

Affiliations

Barcelona Supercomputing Center, Universitat Politècnica de Catalunya

Publications

Order By: Most citations

Adaptive and Architecture-Independent Task Granularity for Recursive Applications

Navarro

Mateo

Pérez

et al. 2017

View full text Add to dashboard Cite

In the last few decades, modern applications have become larger and more complex. Among the users of these applications, the need to simplify the process of identifying units of work increased as well. With the approach of tasking models, this want has been satisfied. These models make scheduling units of work much more user-friendly. However, with the arrival of tasking models, came granularity management. Discovering an application's optimal granularity is a frequent and sometimes challenging task for a wide range of recursive algorithms. Often, finding the optimal granularity will cause a substantial increase in performance. With that in mind, the quest for optimality is no easy task. Many aspects have to be considered that are directly related to lack or excess of parallelism in applications. There is no general solution as the optimal granularity depends on both algorithm and system characteristics. One commonly used method to find an optimal granularity consists in experimentally tuning an application with different granularities until an optimal is found. This paper proposes several heuristics which, combined with the appropriate monitoring techniques, allow a runtime system to automatically tune the granularity of recursive applications. The solution is independent of the architecture, execution environment or application being tested. A reference implementation in OmpSs-a task-parallel programming model-shows the programmability, ease of use and competitive performance of the proposed solution. Results show that the proposed solution is able to achieve, for any scenario, at least 75% of the performance of optimally tuned applications.

show abstract

Enhancing Resource Management Through Prediction-Based Policies

Navarro

Lorenzon

Ayguadé

et al. 2020

View full text Add to dashboard Cite

Aging-Aware Parallel Execution

Medeiros

Berned

Navarro

et al. 2021

IEEE Embedded Syst. Lett.

View full text Add to dashboard Cite

Computation has been pushed to the edge to decrease latency and alleviate the computational burden of the IoT applications in the cloud. However, the increasing processing demands of Edge Applications make necessary the employment of platforms that exploit thread-level parallelism (TLP). Yet, power and heat dissipation rise as TLP inadvertently increases or when parallelism is not cleverly exploited, which may be the result of the non-ideal use of a given PPI (Parallel Program Interface). Besides the common issues, such as the need for more robust power sources and better cooling, heat also adversely affects aging, accelerating phenomenons such as negative bias temperature instability (NBTI) and hot-carrier injection (HCI), which further reduces processor lifetime. Hence, considering that increasing the lifespan of an edge device is key, so the number of times the application set may execute until its end-of-life is maximized, we propose BALDER. It is a learning framework capable of automatically choosing optimal configuration executions (PPI and number of threads) according to the parallel application at hand, aiming to maximize the trade-off between aging and performance. When executing ten well-known applications on two multicore embedded architectures, we show that BALDER can find a nearly-optimal configuration for all our experiments.

show abstract

Seamless optimization of the GEMM kernel for task-based programming models

Lorenzon

Marques

Navarro

et al. 2022

View full text Add to dashboard Cite

The general matrix-matrix multiplication (GEMM) kernel is a fundamental building block of many scientific applications. Many libraries such as Intel MKL and BLIS provide highly optimized sequential and parallel versions of this kernel. The parallel implementations of the GEMM kernel rely on the well-known fork-join execution model to exploit multi-core systems efficiently. However, these implementations are not well suited for task-based applications as they break the data-flow execution model. In this paper, we present a task-based implementation of the GEMM kernel that can be seamlessly leveraged by task-based applications while providing better performance than the fork-join version. Our implementation leverages several advanced features of the OmpSs-2 programming model and a new heuristic to select the best parallelization strategy and blocking parameters based on the matrix and hardware characteristics. When evaluating the performance and energy consumption on two modern multi-core systems, we show that our implementations provide significant performance improvements over an optimized OpenMP fork-join implementation, and can beat vendor implementations of the GEMM (e.g., Intel MKL and AMD AOCL). We also demonstrate that a real application can leverage our optimized task-based implementation to enhance performance. CCS CONCEPTS• Computing methodologies → Massively parallel algorithms; • Theory of computation → Parallel computing models.

show abstract

An Application-Driven Approach to Mitigate Aging by Tuning the TLP and Allocation Strategies

Medeiros

Schwarzrock

Navarro

et al. 2020

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Antoni Navarro

Adaptive and Architecture-Independent Task Granularity for Recursive Applications

Enhancing Resource Management Through Prediction-Based Policies

Aging-Aware Parallel Execution

Seamless optimization of the GEMM kernel for task-based programming models

An Application-Driven Approach to Mitigate Aging by Tuning the TLP and Allocation Strategies

Contact Info

Product

Resources

About