Die-stacked DRAM is a technology that will soon be integrated into high-performance systems. Recent studies have focused on hardware caching techniques to make use of the stacked memory, but these approaches require complex changes to the processor and also cannot leverage the stacked memory to increase the system's overall memory capacity. In this work, we explore the challenges of exposing the stacked DRAM as part of the system's physical address space. This non-uniform memory access (NUMA)-style approach greatly simplifies the hardware and increases the physical memory capacity of the system, but it pushes the burden of managing the heterogeneous memory architecture (HMA) onto the software layers. We first explore simple (and somewhat impractical) schemes to manage the HMA, and then refine the mechanisms to address a variety of hardware and software implementation challenges. In the end, we present an HMA approach with low hardware and software impact that can dynamically tune itself to different application scenarios, achieving performance even better than that of the (impractical-to-implement) baseline approaches.
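To make the software-managed HMA idea concrete, below is a minimal sketch of one plausible epoch-based hot-page policy: rank pages by observed access count each epoch and keep the hottest ones in the fast (die-stacked) tier. The policy, the capacity constant, and all function names are illustrative assumptions for exposition, not the paper's actual mechanism.

```python
# Sketch of an assumed epoch-based hot-page policy for a two-tier HMA:
# the hottest pages (by access count) occupy the die-stacked tier, and
# cooled-off pages are demoted to off-package DRAM. Illustrative only;
# the paper's refined schemes differ in how counts are gathered and how
# migrations are throttled.

from collections import Counter

STACKED_CAPACITY_PAGES = 4  # assumed capacity of the fast tier, in pages

def plan_epoch(access_counts: Counter, in_stacked: set) -> tuple:
    """Return (pages_to_promote, pages_to_demote) for the next epoch."""
    hottest = {page for page, _ in access_counts.most_common(STACKED_CAPACITY_PAGES)}
    promote = hottest - in_stacked   # hot pages currently in slow memory
    demote = in_stacked - hottest    # cooled-off pages occupying fast memory
    return promote, demote

# Example epoch: page 7 became hot, page 3 cooled off.
counts = Counter({7: 900, 1: 500, 2: 450, 5: 400, 3: 10})
promote, demote = plan_epoch(counts, in_stacked={1, 2, 3, 5})
print(promote, demote)  # {7} {3}
```

A real implementation would also have to bound migration bandwidth per epoch, which is one of the practical refinements the abstract alludes to.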
Computers with hardware accelerators, also referred to as hybrid-core systems, speed up applications by offloading certain compute operations that can run faster on accelerators. Thus, it is not surprising that many of the TOP500 supercomputers use accelerators. However, in addition to procurement cost, significant programming and porting effort is required to realize the potential benefit of such accelerators. Hence, before building such a system it is prudent to answer the question "what is the projected performance benefit from accelerators for the workloads of interest?" We address this question by way of a performance-modeling framework that predicts realizable application performance on accelerators rapidly and accurately, without undertaking the considerable effort of porting and tuning. The modeling framework first automatically identifies compute patterns commonly found in scientific applications, which we term idioms, that may benefit from accelerator technology. Next, the framework models the predicted speedup of those idioms if they were to be ported to and run on hardware accelerators. As a proof of concept, we characterize two kinds of accelerators: 1) the FPGA accelerators on a Convey HC-1 system, and 2) an NVIDIA Fermi GPU. We model the performance of the gather/scatter and stream idioms, and our predictions show that, where these occur in two full-scale HPC applications, Milc and HYCOM, gather/scatter speeds up by as much as 15x and stream by as much as 14x, while overall compute time improves by 3.4% for Milc and by 20% for HYCOM. The cost of migrating data to/from the accelerator device can dwarf the benefit of speedup, and hence we also present models of data migration cost and its impact on the performance of Milc and HYCOM.
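The gap between a 15x idiom speedup and a 3.4% whole-application improvement follows from an Amdahl-style accounting that also charges for host-device data movement. Below is a minimal sketch of that kind of first-order model; every parameter name and number is an illustrative assumption, not a value or formula taken from the paper.

```python
# Sketch of a first-order offload model: the offloaded idiom shrinks by
# its measured speedup, but each offload pays a data-migration cost over
# the host-device link. All names and numbers are assumed for exposition.

def offload_time(idiom_time_s: float, speedup: float,
                 bytes_moved: float, link_bw_bytes_per_s: float,
                 link_latency_s: float = 10e-6) -> float:
    """Predicted time for an idiom when run on the accelerator."""
    return idiom_time_s / speedup + bytes_moved / link_bw_bytes_per_s + link_latency_s

def app_time(total_time_s: float, idiom_time_s: float, accel_time_s: float) -> float:
    """Amdahl-style whole-application time: only the idiom is accelerated."""
    return total_time_s - idiom_time_s + accel_time_s

# Example: an idiom that is 25% of a 100 s run, sped up 15x, but moving
# 8 GB over an assumed 8 GB/s PCIe-like link.
t_accel = offload_time(25.0, 15.0, 8e9, 8e9)
print(app_time(100.0, 25.0, t_accel))  # ~77.7 s: migration eats much of the 15x
```

In this toy instance the transfer term (1 s) is already comparable to the accelerated compute (1.67 s), illustrating the abstract's point that data migration can dwarf the raw idiom speedup.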