DwarfCode: A Performance Prediction Tool for Parallel Applications

Zhang, Weizhe; Cheng, Albert M. K.; Subhlok, Jaspal

doi:10.1109/tc.2015.2417526

Cited by 27 publications

(10 citation statements)

References 42 publications


“…A wide range of techniques have been proposed to accelerate convolution operations [1], [2], [3], [4], [5], [6], [7], [8]. Among these methods, general matrix multiplication (GEMM) [6], [7], fast Fourier transform (FFT) [2] and Winograd [3] methods are the broadly adopted ones.…”

Section: A Column Reuse Optimization 1) Standard Convolution (mentioning)

confidence: 99%

Optimizing GPU Memory Transactions for Convolution Operations

Zhang, Wang (2020)

2020 IEEE International Conference on Cluster Computing (CLUSTER)

Self Cite

“…It seems that these applications are not suitable for heterogeneous platforms. To solve this problem, a promising approach is to make the translator can predict application performances on different platforms [35,36,32,6], and automatically decide whether to offload or not.…”

Section: Performance Of Oao Version (mentioning)

confidence: 99%

Automatic translation of data parallel programs for heterogeneous parallelism through OpenMP offloading

et al. 2020

Self Cite

“…Trace-based methods: Trace-driven methods, which are frequently used in simulators [8], [7] and benchmark generation tools [24], [25], can capture detailed performance behavior and model the performance of parallel programs automatically. However, trace-driven methods have some limitations.…”

Section: Related Work (mentioning)

confidence: 99%

Multi-Parameter Performance Modeling Based on Machine Learning with Basic Block Features

Hao, Zhang, Wang, et al. (2019)

2019 IEEE Intl Conf on Parallel & Distributed Processing With Applications, Big Data & Cloud Computing, Sustainable Com

Self Cite

Considering the increasing complexity and scale of HPC architecture and software, the performance modeling of parallel applications on large-scale HPC platforms has become increasingly important. It plays an important role in many areas, such as performance analysis, job management, and resource estimation. In this work, we propose a performance modeling and prediction framework called SmartPred, which utilizes basic block frequencies as features and uses machine learning algorithms to automatically construct multi-parameter performance models with high generalization ability. To reduce the prediction overhead, we propose some feature-filtering strategies to reduce the number of features in the training stage and build a serial program called BBF collector for each target application to quickly collect feature values in the prediction stage. We demonstrate the use of SmartPred on the TianHe-2 supercomputer with six parallel applications. Results show that SmartPred with SVR achieves better prediction than other input parameter-based modeling methods. The average prediction error and average standard deviation of prediction errors of SmartPred are 8.42% and 6.09%, respectively. In the prediction stage, the average prediction overhead of SmartPred is less than 0.13% of the total execution time.
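To make the two ideas in this abstract concrete, the following is a minimal sketch, not SmartPred's actual implementation. It assumes one simple filtering rule (drop basic-block frequency features that never change across training runs; the abstract does not say which strategies the authors use) and shows how an average prediction error and the standard deviation of prediction errors could be computed. The function names and all numbers are hypothetical illustrations.

```python
from statistics import mean, pstdev


def filter_constant_features(rows):
    """Return the column indices whose value varies across training runs.

    Assumption: a feature that is identical in every run carries no
    information for a regression model, so it can be dropped.
    """
    n_cols = len(rows[0])
    return [j for j in range(n_cols) if len({row[j] for row in rows}) > 1]


def relative_errors(predicted, measured):
    """Per-run relative prediction error |predicted - measured| / measured."""
    return [abs(p - m) / m for p, m in zip(predicted, measured)]


# Hypothetical basic-block frequency vectors, one row per training run.
runs = [
    [100, 5, 2000],
    [200, 5, 4000],
    [400, 5, 8000],
]
kept = filter_constant_features(runs)  # column 1 is constant and is dropped

# Hypothetical predicted vs. measured execution times (seconds).
predicted = [1.1, 1.9, 4.2]
measured = [1.0, 2.0, 4.0]

errs = relative_errors(predicted, measured)
avg_error = mean(errs)     # average prediction error
error_std = pstdev(errs)   # standard deviation of prediction errors

print(f"kept features: {kept}")
print(f"average error: {avg_error:.2%}, std of errors: {error_std:.2%}")
```

A regression model such as SVR would then be trained only on the kept columns; that step is omitted here to keep the sketch free of third-party dependencies.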
