Proceedings of the 48th International Symposium on Microarchitecture 2015
DOI: 10.1145/2830772.2830780

Cross-architecture performance prediction (XAPP) using CPU code to predict GPU performance

Abstract: GPUs have become prevalent and more general purpose, but GPU programming remains challenging and time-consuming for the majority of programmers. In addition, it is not always clear which codes will benefit from being ported to the GPU. Therefore, a tool that estimates GPU performance for a piece of code before a GPU implementation is written is highly desirable. To this end, we propose Cross-Architecture Performance Prediction (XAPP), a machine-learning-based technique that uses only single-threaded CPU impleme…

Citations: cited by 89 publications (43 citation statements)
References: 44 publications
“…Reference [7] uses both static attributes such as numbers of different types of instructions and dynamic attributes such as number of cache misses for regression. Reference [2] proposes cross-architecture performance prediction. It is a machine training based technique using both static and dynamic attributes from many programs from some/different benchmarks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Ardalani et al. also used machine learning to train GPU performance models [7]. Their modeling included two techniques: the forward feature selection stepwise regression and the bootstrap aggregating.…”
Section: A Pipeline Analysis (mentioning)
confidence: 99%
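
The two techniques named in the statement above can be illustrated with a minimal sketch. This is not the authors' implementation: the use of ordinary least squares as the base model, the training sum of squared errors as the selection criterion, and every function name below are assumptions made for illustration. Forward selection greedily adds the feature that most reduces the error, and bootstrap aggregating averages the predictions of models fitted on resampled copies of the training set.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, weights...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Apply a model returned by fit_ols to the columns of X."""
    return coef[0] + X @ coef[1:]


def forward_select(X, y, max_features):
    """Greedy forward selection (assumed criterion: training SSE)."""
    selected = []
    remaining = list(range(X.shape[1]))

    def sse_with(j):
        # Training error if feature j is added to the current selection.
        cols = selected + [j]
        resid = y - predict_ols(fit_ols(X[:, cols], y), X[:, cols])
        return float(resid @ resid)

    while remaining and len(selected) < max_features:
        best = min(remaining, key=sse_with)
        selected.append(best)
        remaining.remove(best)
    return selected


def bagged_fit(X, y, n_models, max_features, rng):
    """Bootstrap aggregating: one stepwise model per resampled training set."""
    models = []
    n = len(y)
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        Xb, yb = X[idx], y[idx]
        cols = forward_select(Xb, yb, max_features)
        models.append((cols, fit_ols(Xb[:, cols], yb)))
    return models


def bagged_predict(models, X):
    """Average the predictions of all bootstrap models."""
    preds = [predict_ols(coef, X[:, cols]) for cols, coef in models]
    return np.mean(preds, axis=0)
```

In a performance-prediction setting, the columns of `X` would be program attributes and `y` the measured quantity to predict. Averaging over bootstrap models mainly reduces the variance that greedy feature selection introduces when training data are scarce.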
“…Each slice is decomposed into three subgroups: (1) the Slice Common (Figure 8) which provides additional fixed function architectural units; (2) the Sub-Slice (Figure 9) which contains 24 Execution Units (EUs) and supporting execution hardware; and (3) an L3 cache. RastSim models only the portions of the Slice Common and Sub-Slice that are needed to provide functionally correct rendering.…”
Section: Slice Architecture (mentioning)
confidence: 99%