2011
DOI: 10.1109/tpds.2010.106

Automatic Generation of Multicore Chemical Kernels

Abstract: This work presents the Kinetics PreProcessor: Accelerated (KPPA), a general analysis and code generation tool that achieves significantly reduced time-to-solution for chemical kinetics kernels on three multicore platforms: NVIDIA GPUs using CUDA, the Cell Broadband Engine, and Intel Quad-Core Xeon CPUs. A comparative performance analysis of chemical kernels from WRFChem and the Community Multiscale Air Quality Model (CMAQ) is presented for each platform in double and single precision on coarse and fine grids. …

Cited by 18 publications (13 citation statements)
References 32 publications
“…This section presents parallelization strategies for Rosenbrock integration in one-cell-per-thread, N-cells-per-thread, and 4/2-cells-per-thread decompositions. For performance benchmarks of parallelized Rosenbrock solvers, see [32][33][34][62].…”
Section: Acceleration Aspects
type: mentioning
confidence: 99%
“…KPPA (the Kinetics PreProcessor: Accelerated) [32] is the next-generation KPP tool that achieves significantly reduced time-to-solution for chemical kinetics kernels on both traditional and emerging architectures. In addition to the basic KPP functionality, KPPA generates OpenMP code with SSE or AltiVec for traditional CPUs, CUDA code for NVIDIA GPUs, and optimized C codes for the Cell Broadband Engine Architecture (CBEA), in either double or single precision.…”
Section: The Kinetic Preprocessor: Accelerated (KPPA)
type: mentioning
confidence: 99%
“…To date, numerous automatic CPU-to-GPU source parallelization translation tools [9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27], including algorithmic skeleton based [14][15][16], polyhedral model based [9][10][11][12][13], or directive based [17][18][19][20][21][22][23] have been developed for academic and commercial use. While their acceleration is promising, utilizing them by normal users in general real-world applications is still challenging.…”
Section: Introduction
type: mentioning
confidence: 99%