Hardware performance counters are CPU registers that count data loads and stores, cache misses, and other events. Counter data can help programmers understand software performance. Although CPUs typically have multiple counters, each can monitor only one type of event at a time, and some counters can monitor only certain events. Therefore, some CPUs cannot concurrently monitor interesting combinations of events. Software multiplexing partly overcomes this limitation by using time sharing to monitor multiple events on one counter. However, counter multiplexing is harder to implement for multithreaded programs than for single-threaded ones because of certain difficulties in managing the length of the time slices. This paper describes a software library called MPX that overcomes these difficulties. MPX allows applications to gather hardware counter data concurrently for any combination of countable events. MPX data are typically within a few percent of counts recorded without multiplexing.
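The core idea behind counter multiplexing is extrapolation: when an event is scheduled onto a counter for only part of the run, the observed count is scaled up by the fraction of time the event was actually being counted. The sketch below illustrates this scaling rule; it is an illustration of the general technique (as also used by Linux perf), not MPX's actual API, and the names and numbers are assumptions.

```python
# Hedged sketch (not MPX's actual interface): how time-shared counter
# multiplexing extrapolates a partial count to a full-run estimate.
# The estimate assumes the event rate is roughly uniform over the run.

def estimate_count(raw_count, time_counted, time_total):
    """Extrapolate a multiplexed counter reading to the full interval."""
    if time_counted == 0:
        return 0.0
    return raw_count * (time_total / time_counted)

# Hypothetical example: an event counted during 4 of 10 equal time slices.
raw = 4_000_000  # events observed while the event was scheduled
est = estimate_count(raw, time_counted=4, time_total=10)
print(est)       # 10000000.0 -- accurate only if the rate is uniform
```

The "few percent" error the abstract reports is exactly the cost of this assumption: if the event rate varies across time slices, the extrapolated count drifts from the true count.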
PerfTrack is a data store and interface for managing performance data from large-scale parallel applications.
Introduction

Effective performance tuning of large-scale parallel and distributed applications on current and future clusters, supercomputers, and grids requires the ability to integrate performance data gathered with a variety of monitoring tools, stored in different formats, and possibly residing in geographically separate data stores. Performance data sharing between different performance studies or scientists is currently done manually or not done at all. Manual transfer of disorganized files with unique and varying storage formats is a time-consuming and error-prone approach. The granularity of exchange is often entire data sets, even if only a small subset of the transferred data is actually needed. All of these problems worsen as high-end systems continue to scale up, greatly increasing the size of the data sets generated by performance studies. The overhead of sharing data discourages collaboration and data reuse.

There are several key challenges associated with performance data sharing. First, performance tool output is not uniform, requiring the ability to translate between different metrics and tool outputs to compare the data sets. Second, the data sets resulting from performance studies of tera- and peta-scale applications are potentially large and raise challenging scalability issues. For some types of summary data, database storage is a well-understood task; for others, such as trace data and data generated with a dynamic-instrumentation-based tool, further research is necessary to develop an efficient and flexible representation. In addition to increased ease of collaboration between scientists using or studying a common application, the potential benefits of a tool to collect
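The first challenge above, translating between different tools' metric names so their outputs become comparable, can be addressed with a normalized store. The following is a minimal hypothetical sketch, not PerfTrack's actual schema: every table, column, and metric name here is an assumption made for illustration.

```python
# Hypothetical sketch (not PerfTrack's schema): a minimal relational
# layout that maps each tool's metric name to one canonical metric,
# so results recorded by different tools land in a comparable form.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metric (id INTEGER PRIMARY KEY, canonical_name TEXT UNIQUE);
CREATE TABLE tool_metric (          -- per-tool name -> canonical metric
    tool TEXT, tool_name TEXT, metric_id INTEGER REFERENCES metric(id));
CREATE TABLE result (
    run_id TEXT, metric_id INTEGER REFERENCES metric(id), value REAL);
""")
conn.execute("INSERT INTO metric (canonical_name) VALUES ('l2_cache_misses')")
conn.executemany(
    "INSERT INTO tool_metric VALUES (?, ?, 1)",
    [("toolA", "L2_MISS"), ("toolB", "l2.misses")])
# A result stored under either tool's name refers to the same metric row.
conn.execute("INSERT INTO result VALUES ('run1', 1, 123456.0)")
row = conn.execute(
    "SELECT m.canonical_name, r.value "
    "FROM result r JOIN metric m ON m.id = r.metric_id").fetchone()
print(row)  # ('l2_cache_misses', 123456.0)
```

A real store for tera- and peta-scale studies would of course need far more than this, particularly for trace data, which is exactly the open representation problem the introduction identifies.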
MapReduce-tailored distributed filesystems, such as HDFS for Hadoop MapReduce, and parallel high-performance computing filesystems are designed for considerably different workloads. The purpose of our work is to examine the performance of each filesystem when both sorts of workload run on it concurrently. We examine two workloads on two filesystems. For the HPC workload, we use the IOR checkpointing benchmark and the Parallel Virtual File System, Version 2 (PVFS); for Hadoop, we use an HTTP attack classifier and the CloudStore filesystem. We analyze the performance of each filesystem when it concurrently runs its "native" workload as well as the non-native workload.