Space optimal solution for data reordering in streaming applications on NoC based MPSoC

Genius, Daniela; Kordon, Alix Munier; Abidine, Khouloud Zine El

doi:10.1016/j.sysarc.2013.04.001

Cited by 2 publications

(2 citation statements)

References 29 publications


“…More generally, our work is also closely related to previous work on the design of NoCs with support for real-time and safety-critical applications [20], [21], [22], [23] and on application mapping onto many-core architectures [24], [25], [26], [27], [28], the difference being given by the integrated approach we use and by the statically scheduled NoC communications which ensure high timing precision and efficiency for the chosen class of applications.…”

Section: Related Work (mentioning)

confidence: 88%

Reconciling performance and predictability on a many-core through off-line mapping

Carle, Djemal, Genius, et al. 2014

2014 9th International Symposium on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC)

Self Cite


We start from a general-purpose many-core architecture designed for average-case performance and ease of use. In particular, its distributed shared memory programming model allows the use of a code generation flow based on the (unmodified) gcc compiler chain. We modify this architecture and extend the code generation flow to allow the construction of efficient hard real-time systems starting from dependent task specifications. We rely on a static (off-line) real-time scheduling paradigm well adapted to embedded control and signal processing applications with regular control structure. We modify the architecture (and in particular the on-chip network) to allow the implementation of static schedules with very high (clock cycle) temporal precision. On the software side, we define application mapping rules ensuring that the timing precision provided by the hardware is not lost. These mapping rules include requirements on the allocation of data variables to specific RAM banks and on the use of locks to ensure the absence of contentions during access to shared resources. Applications complying with these rules can be written manually or automatically obtained using a new mapping tool that takes all the allocation and scheduling decisions. Compilation of the resulting C code is still done using the (unmodified) gcc compiler chain. The resulting platform provides good performance, and at the same time provides very high timing precision, as shown by two case studies (an embedded controller and an implementation of the FFT). We conclude our paper with a presentation of some ongoing work on the subject: a case study (an implementation of the H.264 decoder) meant to test the limitations of our method.


“…Similar approaches are taken in more dynamic techniques aimed at signal processing systems [8], [20], [7]. Techniques such as DOL [5] or the one of Zhai et al [47] allow for real-time mapping, but without considering the details of the NoC.…”

Section: Related Work (mentioning)

confidence: 99%

Static Mapping of Real-Time Applications onto Massively Parallel Processor Arrays

Carle, Djemal, Potop-Butucaru, et al. 2014

2014 14th International Conference on Application of Concurrency to System Design


On-chip networks (NoCs) used in multiprocessor systems-on-chips (MPSoCs) pose significant challenges to both on-line (dynamic) and off-line (static) real-time scheduling approaches. They have large numbers of potential contention points, have limited internal buffering capabilities, and network control operates at the scale of small data packets. Therefore, efficient resource allocation requires scalable algorithms working on hardware models with a level of detail that is unprecedented in real-time scheduling. We consider here a static scheduling approach, and we target massively parallel processor arrays (MPPAs), which are MPSoCs with large numbers (hundreds) of processing cores. We first identify and compare the hardware mechanisms supporting precise timing analysis and efficient resource allocation in existing MPPA platforms. We determine that the NoC should ideally provide the means of enforcing a global communications schedule that is computed off-line (before execution) and which is synchronized with the scheduling of computations on processors. On the software side, we propose a novel allocation and scheduling method capable of synthesizing such global computation and communication schedules covering all the execution, communication, and memory resources in an MPPA. To allow an efficient use of the hardware resources, our method takes into account the specificities of MPPA hardware and implements advanced scheduling techniques such as software pipelining and pre-computed preemption of data transmissions. We evaluate our technique by mapping two signal processing applications, for which we obtain good latency, throughput, and resource use figures.
