Pedro Benedicte scite author profile

Kosmidis

Quiñones

et al. 2016

Abstract: Obtaining Worst-Case Execution Time (WCET) estimates is a required step in the software verification of real-time embedded systems. Measurement-Based Probabilistic Timing Analysis (MBPTA) aims at obtaining WCET estimates for industrial-size software running on hardware platforms with high-performance features. MBPTA relies on randomizing the timing behavior (functional behavior is left unchanged) of hard-to-predict events, such as the location of objects in memory and hence their associated cache behavior, that significantly impact software's WCET estimates. Software time-randomized caches (sTRc) have recently been proposed to enable MBPTA on top of commercial off-the-shelf (COTS) caches (e.g. with modulo placement). However, some random events may challenge MBPTA reliability on top of sTRc. In this paper, for sTRc and programs with homogeneously accessed addresses, we determine whether the number of observations taken at analysis, as part of the normal MBPTA application process, captures the cache events significantly impacting execution time and WCET. If this is not the case, our techniques provide the user with the number of extra runs to perform to guarantee that cache events are captured for a reliable application of MBPTA. Our techniques are evaluated with synthetic benchmarks and an avionics application.

Modelling the confidence of timing analysis for time randomised caches

Kosmidis

Quiñones

et al. 2016

Abstract: Timing is a key non-functional property in embedded real-time systems (ERTS). ERTS increasingly require higher levels of performance that can only be sensibly provided by deploying high-performance hardware, which, however, complicates timing analysis. Measurement-Based Probabilistic Timing Analysis (MBPTA) aims at analysing the timing behaviour of ERTS deploying complex hardware features such as caches. A key parameter for MBPTA to provide reliable results is the number of runs to perform to ensure that the execution time measurements taken at analysis time are probabilistically representative of the execution times that can occur during system operation. In this paper, focusing on the cache, acknowledged as one of the most complex resources to analyse for timing, we address the problem of determining whether the number of observations taken at analysis, as part of the normal MBPTA application process, captures the cache events significantly impacting execution time and Worst-Case Execution Time (WCET). If this is not the case, our techniques provide the user with the number of extra runs to perform to guarantee that those cache events are captured, ensuring confidence in the provided WCET estimates.
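To illustrate why the number of runs matters, the following is a minimal sketch, not the method of the paper. It assumes a single cache event of interest that occurs independently in each run with a known probability `p_event` (both the independence and the known probability are simplifying assumptions made here). Under those assumptions, the chance of never observing the event in R runs is (1 - p_event)^R, which gives the number of runs needed to push that chance below a chosen threshold. All function and parameter names are hypothetical.

```python
import math


def miss_probability(p_event: float, runs: int) -> float:
    """Probability that an event with per-run probability p_event
    is never observed in `runs` independent runs."""
    return (1.0 - p_event) ** runs


def runs_needed(p_event: float, max_miss_prob: float) -> int:
    """Smallest number of runs R such that (1 - p_event)**R <= max_miss_prob.

    Assumes independent runs and a known p_event (illustrative assumptions).
    """
    if not (0.0 < p_event < 1.0) or not (0.0 < max_miss_prob < 1.0):
        raise ValueError("both probabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(max_miss_prob) / math.log(1.0 - p_event))


def extra_runs(p_event: float, max_miss_prob: float, runs_done: int) -> int:
    """Additional runs required on top of those already performed."""
    return max(0, runs_needed(p_event, max_miss_prob) - runs_done)


if __name__ == "__main__":
    # Hypothetical numbers: an event seen in 1% of runs, 1000 runs already taken.
    total = runs_needed(0.01, 1e-9)
    print("total runs needed:", total)
    print("extra runs needed:", extra_runs(0.01, 1e-9, 1000))
```

The rarer the event, the more runs are required for the same threshold, which is the intuition behind asking whether the default number of observations is sufficient.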

Performance Analysis and Optimization of Automotive GPUs

Mazzocchetti

Tabani

et al. 2019

Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD) have drastically increased the performance demands of automotive systems. Suitable high-performance platforms building upon Graphics Processing Units (GPUs) have been developed to respond to this demand, with the NVIDIA Jetson TX2 being a relevant representative. However, whether high-performance GPU configurations are appropriate for automotive setups remains an open question. This paper aims to shed light on this question by modelling an automotive GPU (Jetson TX2), analyzing its microarchitectural parameters against relevant benchmarks, and identifying specific configurations able to meaningfully increase performance within similar cost envelopes, or to decrease costs while preserving original performance levels. Overall, our analysis opens the door to the optimization of automotive GPUs for further system efficiency.

LAEC: Look-Ahead Error Correction Codes in Embedded Processors L1 Data Cache

Hernández

Abella

et al. 2019

As implementation technology shrinks, the presence of errors in cache memories is becoming an increasing issue in all computing domains. Critical systems, e.g. space and automotive, are especially exposed and susceptible to reliability issues. Furthermore, hardware designs in these systems are migrating to multicore systems with multi-level caches, in which write-through first-level data (DL1) caches have been shown to heavily harm average and guaranteed performance. While write-back DL1 caches solve this problem, they come with their own challenges: they need Error Correction Codes (ECC) to tolerate soft errors, but implementing DL1 ECC in simple embedded micro-controllers requires either complex hardware to squash instructions consuming erroneous data, or delayed delivery of data to correct potential errors, which impacts performance even if such a process is pipelined. In this paper we present a low-complexity hardware mechanism to anticipate data fetch and error correction in DL1 so that (1) correct data is always delivered and (2) additional delays are avoided in most cases. This achieves both high guaranteed performance and an effective solution against errors.