A History-Based Performance Prediction Model with Profile Data Classification for Automatic Task Allocation in Heterogeneous Computing Systems

Sato, Kaito; Komatsu, Kazuhiko; Takizawa, Hiroyuki; Kobayashi, Hiroaki

doi:10.1109/ispa.2011.36

Cited by 11 publications

(4 citation statements)

References 7 publications

“…However, accuracy is not that good, although still reasonable (between 15.8 and 27.3 percent), when estimating execution time on an unknown, new GPU. Finally, the paper presented by Sato et al [21] discusses different machine learning models, reporting for the best one error rates around 1 percent. However, the process used to calibrate the model is not detailed sufficiently.…”

Section: General-purpose Models (mentioning)

confidence: 99%

A Survey of Performance Modeling and Simulation Techniques for Accelerator-Based Computing

Lopez-Novoa

Mendiburu

Miguel-Alonso

2015

IEEE Trans. Parallel Distrib. Syst.

High-performance computing platforms are moving from homogeneous individual units to heterogeneous systems, in which each unit combines homogeneous cores with accelerator devices. Accelerators such as GPUs, FPGAs, and DSPs are usually designed for specific, intensive types of computing tasks. The presence of these devices has created fresh and attractive development platforms for developers and designers, along with brand new performance analysis frameworks and optimization tools. Some accelerator devices, such as GPUs and Intel's Xeon Phi, represent the cutting edge in performance. We outline some of the existing heterogeneous systems and their development frameworks. The core of this study is a review of performance modeling of these devices. In this paper, we address the emerging issues that affect the performance of these devices and the associated techniques employed for simulation and evaluation.

“…This is also known as auto-tuning, which decides the granularity of a program running on multi-core processors. There are several work (Moore and Childers, 2012; Kurzak et al, 2012; Sato et al, 2011; Vuduc and Moon, 2005) discussing about auto-tuning of applications on multi-core processors. The previous work focus on explicitly searching the partially-pruned parameter space.…”

Section: Related Work (mentioning)

confidence: 99%

Reconfigurable multi-core architecture - a plausible solution to the von Neumann performance bottleneck

Lin

Chao

et al. 2015

IJAIS

The ill-famed von Neumann bottleneck has been the main performance hurdle since the invention of computers. Although several techniques such as separate data/instruction caches, branch prediction, and parallel computing have been proposed and have improved efficiency, the throughput bottleneck between CPU and memory remains. We propose a novel reconfigurable multi-core architecture (RMA) to address this issue via the dynamic allocation of heterogeneous computing resources and distributed memory. We show how this is feasible with the state-of-the-art technologies of dynamic partial reconfiguration of hardware resources and runtime operating system configuration. Experiments and analysis show how RMA alleviates the performance bottleneck. This paper is a revised and expanded version of a paper entitled 'Reconfigurable multi-core architecture - a plausible solution to the von Neumann performance bottleneck' presented at 2013 IEEE 7th International Symposium on Embedded SoCs (MCSoC-13), Tokyo, Japan, 26-28 September 2013.

“…The SPs have been used for CPU 2000, and CPU 2006 programs. Different methods of performance prediction for task scheduling and parallelization have been discussed by Hauck et al, Sato et al, and Berube and Amaral.…”

Section: Related Work (mentioning)

confidence: 99%

Incorporating hardware and software features into a prediction model for processor‐system throughput

Beg

Prasad

2015

Comp Applic In Engineering

Cycle-accurate simulation is a method for design space study of a processor system before it goes to hardware implementation. Even though the simulations provide precise results about system performance, the simulation times are exorbitantly high for practical systems. An alternative is therefore to use experimentally developed models, which tend to be faster than such simulations. This is an extension of our previous work in the area of neural network models for assessing processor system throughput. In addition to hardware-related parameters, the current work includes multiple software-related parameters to better represent the dynamic behavior of programs. Consequently, the model provides higher-accuracy estimates of a widely used performance metric (instructions per cycle) when tested with industry-standard CPU benchmark programs. Potential uses of the model are compiler design and computer architecture research and teaching.
