The AXIOM project (Agile, eXtensible, fast I/O Module)

Theodoropoulos, Dimitris; Pnevmatikatos, D.; Martínez, Carlos Álvarez; Ayguadé, Eduard; Bueno, Javier; Filgueras, Antonio; Jiménez-González, Daniel; Martorell, Xavier; Navarro, Nacho; Segura, Carlos; Fernández, Carles; Oro, David; Saeta, Javier R.; Gai, Paolo; Rizzo, Antonio; Giorgi, Roberto

doi:10.1109/samos.2015.7363684

Cited by 14 publications

(6 citation statements)

References 16 publications


“…The AXIOM project is the context where this research is currently developed: other recent papers describe more in detail the hardware framework [2] and the software layers [3].…”

Section: Related Work (mentioning)

confidence: 99%

“…The hardware thread support is represented in Figure 1 by the eXtended Shared Memory (XSM) block. Standard high-speed and low-latency interconnections (e.g., PCIe 3.0) may provide enough bandwidth, but the exact interconnects is under exploration [2].…”

Section: Thread Management (mentioning)

confidence: 99%

“…In the context of the project AXIOM [2], [3], [4] we are exploring the feasibility and trade-offs in designing and manufacturing a new Single Board Computer that could serve flexibly for a number of current and future applications. This board is based on FPGAs and embedded processors, e.g.…”

Section: Introduction (mentioning)

confidence: 99%

Scalable Embedded Systems: Towards the Convergence of High-Performance and Embedded Computing

Giorgi

2015

2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing

Embedded System toolchains are highly customized for a specific System-on-Chip (SoC). When the application needs more performance, the designer is typically forced to adopt a new SoC and possibly another toolchain. The rationale for not scaling performance by using, e.g., two SoCs, is that maintaining most of the operations on-chip may allow for higher energy efficiency. We are exploring the feasibility and trade-offs of designing and manufacturing a new Single Board Computer (SBC) that could serve flexibly for a number of current and future applications, by allowing scalability through clusters of SBCs while keeping the same programming model for the SBC. This board is based on FPGAs and embedded processors, and its key points are: i) a fast custom interconnect for board-to-board communication and ii) an easily programmable environment which would allow both the off-loading of code into accelerators (either soft-IP blocks or hard-IP blocks) and, at the same time, the distribution of computation across boards. A key challenge to successfully deploying this paradigm is to properly distribute the threads across several boards without the explicit intervention of the programmer. In this paper we describe how to dynamically and efficiently distribute the computational threads in symbiosis with an appropriate memory model to allow the system scalability, so that we can double the performance by simply connecting two boards without i) changing the basic hardware components (e.g., to a different System-on-Chip) and ii) changing the programming model to follow the vendor-specific toolchain. Our approach is to reduce data movement across boards. Our initial experiments have confirmed the feasibility of our approach.

“…The AXIOM project (Agile, eXtensible, fast I/O Module) provides a general framework focusing on easily mapping applications to multi-board processing platforms [8,9]. Unlike other research efforts (such as CONTREX [10], DREAMS [11], EMC² [12], MultiPARTES [13]) that focus mainly on the mixed-criticality applications, AXIOM provides a generic platform with its complete application development suite.…”

Section: Introduction (mentioning)

confidence: 99%

The AXIOM platform for next-generation cyber physical systems

Theodoropoulos, Mazumdar, Ayguadé, et al.

2017

Microprocessors and Microsystems

Cyber-Physical Systems (CPSs) are widely used in many applications that require interactions between humans and their physical environment. These systems usually integrate a set of hardware-software components for optimal application execution in terms of performance and energy consumption. The AXIOM project (Agile, eXtensible, fast I/O Module), presented in this paper, proposes a hardware-software platform for CPS coupled with an easy parallel programming model and sufficient connectivity so that the performance can scale up by adding multiple boards. AXIOM supports a task-based programming model based on OmpSs and leverages a high-speed, inexpensive communication interface called AXIOM-Link. The board also tightly couples the CPU with reconfigurable resources to accelerate portions of the applications. As case studies, AXIOM uses smart video surveillance and smart home living applications.

This work is partially supported by the European Union H2020 program through the AXIOM project (grant ICT-01-2014 GA 645496) and HiPEAC (GA 687698), by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project, and by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272). We also thank the Xilinx University Program for its hardware and software donations. Peer reviewed. Postprint (author's final draft).

“…This section introduces the repercussion of this work in three different facets: 1) the European projects where this work has been, is being or will be used, 2) the influence on the evolution of different programming models, and 3) other Master's and PhD thesis where this work has a significant impact. AXIOM, Agile, eXtensible, fast I/O Module for the cyber-physical era AXIOM is a 3-year European project that started in 2015 [12,151]. It aims at researching new software/hardware architectures for the future Cyber-Physical Systems (CPSs).…”

Section: Impact (mentioning)

confidence: 99%

High-level compiler analysis for OpenMP

Royuela Alcázar

Nowadays, applications from dissimilar domains, such as high-performance computing and high-integrity systems, require levels of performance that can only be achieved by means of sophisticated heterogeneous architectures. However, the complex nature of such architectures hinders the production of efficient code at acceptable levels of time and cost. Moreover, the need for exploiting parallelism adds complications of its own (e.g., deadlocks and race conditions). In this context, compiler analysis is fundamental for optimizing parallel programs. There is however a trade-off between complexity and profit: low-complexity analyses (e.g., reaching definitions) provide information that may be insufficient for many relevant transformations, and complex analyses based on mathematical representations (e.g., polyhedral model) give accurate results at a high computational cost. A range of parallel programming models providing different levels of programmability, performance and portability enable the exploitation of current architectures. However, OpenMP has shown several advantages over its competitors: 1) it delivers levels of performance comparable to highly tunable models such as CUDA and MPI, and better robustness than low-level libraries such as Pthreads; 2) the extensions included in the latest specification meet the characteristics of current heterogeneous architectures (i.e., the coupling of a host processor to one or more accelerators, and the capability of expressing fine-grained, both structured and unstructured, and highly dynamic task parallelism); 3) OpenMP is widely implemented by several chip (e.g., Kalray MPPA, Intel) and compiler (e.g., GNU, Intel) vendors; and 4) although currently the model lacks resiliency and reliability mechanisms, many works, including this thesis, pursue their introduction in the specification.
This thesis addresses the study of compiler analysis techniques for OpenMP with two main purposes: 1) enhance the programmability and reliability of OpenMP, and 2) prove OpenMP as a suitable model to exploit parallelism in safety-critical domains. Particularly, the thesis focuses on the tasking model because it offers the flexibility to tackle the parallelization of algorithms with load imbalance, recursiveness and uncountable loop based kernels. Additionally, current works have proved the time-predictability of this model, shortening the distance towards its introduction in safety-critical domains. To enable the analysis of applications using the OpenMP tasking model, the first contribution of this thesis is the extension of a set of classic compiler techniques with support for OpenMP. As a basis for including reliability mechanisms, the second contribution consists of the development of a series of algorithms to statically detect situations involving OpenMP tasks, which may lead to a loss of performance, non-deterministic results or run-time failures. A well-known problem of parallel processing related to compilers is the static scheduling of a program represented by a directed graph. Although the literature is extensive in static scheduling techniques, the work related to the generation of the task graph at compile-time is very scant. Compilers are limited by the knowledge they can extract, which depends on the application and the programming model. The third contribution of this thesis is the generation of a predicated task dependency graph for OpenMP that can be interpreted by the runtime in such a way that the cost of solving dependences is reduced to the minimum. 
With the previous contributions as a basis for determining the functional safety of OpenMP, the final contribution of this thesis is the adaptation of OpenMP to the safety-critical domain considering two directions: 1) indicating how OpenMP can be safely used in such a domain, and 2) integrating OpenMP into Ada, a language widely used in the safety-critical domain.
