Hardware performance counters are CPU registers that count data loads and stores, cache misses, and other events. Counter data can help programmers understand software performance. Although CPUs typically have multiple counters, each can monitor only one type of event at a time, and some counters can monitor only certain events. Therefore, some CPUs cannot concurrently monitor interesting combinations of events. Software multiplexing partly overcomes this limitation by using time sharing to monitor multiple events on one counter. However, counter multiplexing is harder to implement for multithreaded programs than for single-threaded ones because of certain difficulties in managing the length of the time slices. This paper describes a software library called MPX that overcomes these difficulties. MPX allows applications to gather hardware counter data concurrently for any combination of countable events. MPX data are typically within a few percent of counts recorded without multiplexing.
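The core idea behind counter multiplexing is extrapolation: when an event is scheduled onto a counter for only part of the run, the observed count is scaled up by the fraction of time the event was actually being counted. The sketch below illustrates this scaling rule; it is an illustration of the general technique (as also used by Linux perf), not MPX's actual API, and the names and numbers are assumptions.

```python
# Hedged sketch (not MPX's actual interface): how time-shared counter
# multiplexing extrapolates a partial count to a full-run estimate.
# The estimate assumes the event rate is roughly uniform over the run.

def estimate_count(raw_count, time_counted, time_total):
    """Extrapolate a multiplexed counter reading to the full interval."""
    if time_counted == 0:
        return 0.0
    return raw_count * (time_total / time_counted)

# Hypothetical example: an event counted during 4 of 10 equal time slices.
raw = 4_000_000  # events observed while the event was scheduled
est = estimate_count(raw, time_counted=4, time_total=10)
print(est)       # 10000000.0 -- accurate only if the rate is uniform
```

The "few percent" error the abstract reports is exactly the cost of this assumption: if the event rate varies across time slices, the extrapolated count drifts from the true count.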
PerfTrack is a data store and interface for managing performance data from large-scale parallel applications.
Introduction

Effective performance tuning of large-scale parallel and distributed applications on current and future clusters, supercomputers, and grids requires the ability to integrate performance data gathered with a variety of monitoring tools, stored in different formats, and possibly residing in geographically separate data stores. Performance data sharing between different performance studies or scientists is currently done manually or not done at all. Manual transfer of disorganized files with unique and varying storage formats is a time-consuming and error-prone approach. The granularity of exchange is often entire data sets, even if only a small subset of the transferred data is actually needed. All of these problems worsen as high-end systems continue to scale up, greatly increasing the size of the data sets generated by performance studies. The overhead of sharing data discourages collaboration and data reuse.

There are several key challenges associated with performance data sharing. First, performance tool output is not uniform, requiring the ability to translate between different metrics and tool outputs to compare the data sets. Second, the data sets resulting from performance studies of tera- and peta-scale applications are potentially large and raise challenging scalability issues. For some types of summary data, database storage is a well-understood task; for others, such as trace data and data generated with a dynamic-instrumentation-based tool, further research is necessary to develop an efficient and flexible representation. In addition to increased ease of collaboration between scientists using or studying a common application, the potential benefits of a tool to collect
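The first challenge above, translating between different tools' metric names so their outputs become comparable, can be addressed with a normalized store. The following is a minimal hypothetical sketch, not PerfTrack's actual schema: every table, column, and metric name here is an assumption made for illustration.

```python
# Hypothetical sketch (not PerfTrack's schema): a minimal relational
# layout that maps each tool's metric name to one canonical metric,
# so results recorded by different tools land in a comparable form.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metric (id INTEGER PRIMARY KEY, canonical_name TEXT UNIQUE);
CREATE TABLE tool_metric (          -- per-tool name -> canonical metric
    tool TEXT, tool_name TEXT, metric_id INTEGER REFERENCES metric(id));
CREATE TABLE result (
    run_id TEXT, metric_id INTEGER REFERENCES metric(id), value REAL);
""")
conn.execute("INSERT INTO metric (canonical_name) VALUES ('l2_cache_misses')")
conn.executemany(
    "INSERT INTO tool_metric VALUES (?, ?, 1)",
    [("toolA", "L2_MISS"), ("toolB", "l2.misses")])
# A result stored under either tool's name refers to the same metric row.
conn.execute("INSERT INTO result VALUES ('run1', 1, 123456.0)")
row = conn.execute(
    "SELECT m.canonical_name, r.value "
    "FROM result r JOIN metric m ON m.id = r.metric_id").fetchone()
print(row)  # ('l2_cache_misses', 123456.0)
```

A real store for tera- and peta-scale studies would of course need far more than this, particularly for trace data, which is exactly the open representation problem the introduction identifies.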
MapReduce-tailored distributed filesystems, such as HDFS for Hadoop MapReduce, and parallel high-performance computing filesystems are designed for considerably different workloads. The purpose of our work is to examine the performance of each filesystem when both sorts of workload run on it concurrently. We examine two workloads on two filesystems. For the HPC workload, we use the IOR checkpointing benchmark and the Parallel Virtual File System, Version 2 (PVFS); for Hadoop, we use an HTTP attack classifier and the CloudStore filesystem. We analyze the performance of each filesystem when it concurrently runs its "native" workload as well as the non-native workload.