Haihang You scite author profile

Collecting Performance Data with PAPI-C

Modern high-performance computer systems continue to increase in size and complexity. Tools to measure application performance in these increasingly complex environments must also increase the richness of their measurements to provide insights into the increasingly intricate ways in which software and hardware interact. PAPI (the Performance API) has provided consistent platform- and operating-system-independent access to CPU hardware performance counters for nearly a decade. Recent trends toward massively parallel multi-core systems with often heterogeneous architectures present new challenges for the measurement of hardware performance information, which is now available not only on the CPU core itself, but scattered across the chip and system. We discuss the evolution of PAPI into Component PAPI, or PAPI-C, in which multiple sources of performance data can be measured simultaneously via a common software interface. Several examples of components and component data measurements are discussed. We explore the challenges to hardware performance measurement in existing multi-core architectures. We conclude with an exploration of future directions for the PAPI interface.

POET: Parameterized Optimizations for Empirical Tuning

Seymour, You, et al. 2007

Principles and construction of MSD adder in ternary optical computer

Shen, Peng, et al. 2010. Sci. China Inf. Sci.

Two remarkable features, ternary values and a massive unit offering thousands of bits of parallel computation, will make the ternary optical computer (TOC) with a modified signed-digit (MSD) adder more powerful and efficient than before for numerical calculations. Based on the decrease-radix design presented previously, a TOC can satisfy either a user requiring huge capacity for data calculations or one with a moderate amount of data, if it is equipped with a prepared adder. Furthermore, with the application of pipelined operations and the proposed data editing technique, the efficiency of the prepared adder can be greatly improved, so that each calculated result can be obtained almost within one clock cycle. It is hoped that by employing an MSD adder, users will be able to enter a new dimension with the creation of a new multiplier, a new divider, and a new matrix operator in a TOC in the near future.

With the current rapid increase in the complexity of computer architectures, the power consumption of large-scale systems has risen prohibitively. Much attention has been focused on reducing power consumption in different ways. One way of solving the problem is to use an optical computer, with its special non-electronic characteristics of high speed, parallelism, multi-valued operation, and low power consumption. Considering these properties, researchers have focused mainly on improving the operating speed [1-3] and enlarging the number of parallel bits in these computers [4][5][6], but have often neglected the problem of reducing power consumption.

A TOC prototype recently developed in our laboratory at Shanghai University is a typical optical computer with a huge number of data bits [6,7]. Based on the decrease-radix design proposed in 2008 [8], we can configure any number of bits as specific groups of tri-valued logic units at any time in the TOC. However, as thousands of bits exist in an adder, the ripple-carry technique is infeasible in a TOC because of the severe carry delay. In addition, the look-ahead carry technique does not suit construction from optical elements because of the high complexity of its tree-type architecture. For these reasons, we proposed a new technique called the direct parallel carry channel (DPCC), aimed at accelerating the carry operation [9]. Unfortunately, this scheme has not been put into practice, for various reasons.
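
To make the carry-delay argument concrete, here is a minimal Python sketch of carry-free addition on radix-2 signed digits {-1, 0, 1}. It follows one common three-step textbook formulation, in which every step is a position-wise table lookup, so no carry ripples along the word. The function names (`msd_add`, `_split`, and so on) are hypothetical, and the paper's own transformation tables and optical implementation are not reproduced in the abstract above, so this should be read as an illustration of the general idea rather than the authors' design.

```python
from itertools import product


def msd_to_int(digits):
    """Value of a little-endian list of digits, each in {-1, 0, 1}."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def int_to_msd(n, width):
    """One valid MSD encoding: the binary digits of |n|, each carrying n's sign."""
    sign = -1 if n < 0 else 1
    mag = abs(n)
    return [sign * ((mag >> i) & 1) for i in range(width)]


def _split(a, b, eager):
    """Position-wise split a[i] + b[i] = 2 * t[i+1] + w[i].

    Each position is handled independently, so the step is fully parallel.
    With eager=True, a digit sum of +/-1 is sent upward as a transfer
    digit and leaves the opposite sign behind; with eager=False it stays
    in place as the weight digit.
    """
    width = len(a)
    t = [0] * (width + 1)
    w = [0] * (width + 1)
    for i in range(width):
        s = a[i] + b[i]
        if abs(s) == 2:
            t[i + 1], w[i] = s // 2, 0
        elif abs(s) == 1:
            if eager:
                t[i + 1], w[i] = s, -s
            else:
                t[i + 1], w[i] = 0, s
    return t, w


def msd_add(x, y):
    """Add two little-endian MSD numbers in three parallel steps."""
    width = max(len(x), len(y))
    x = list(x) + [0] * (width - len(x))
    y = list(y) + [0] * (width - len(y))
    t, w = _split(x, y, eager=True)      # step 1
    t2, w2 = _split(t, w, eager=False)   # step 2
    result = [p + q for p, q in zip(t2, w2)]  # step 3: no carry can arise
    assert all(d in (-1, 0, 1) for d in result)
    return result


def check_all(width):
    """Exhaustively compare msd_add with integer addition for a given width."""
    for a in product((-1, 0, 1), repeat=width):
        for b in product((-1, 0, 1), repeat=width):
            r = msd_add(a, b)
            if msd_to_int(r) != msd_to_int(a) + msd_to_int(b):
                return False
    return True


if __name__ == "__main__":
    total = msd_add(int_to_msd(5, 4), int_to_msd(-3, 4))
    print(total, msd_to_int(total))
```

Because the number of steps is fixed regardless of word length, the latency does not grow with the thousands of bits mentioned above, which is the property a ripple-carry adder lacks.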

A comparison of search heuristics for empirical code optimization

Seymour, You, Dongarra 2008

This paper describes the application of various search techniques to the problem of automatic empirical code optimization. The search process is a critical aspect of auto-tuning systems because the large size of the search space and the cost of evaluating the candidate implementations make it infeasible to find the true optimum point by brute force. We evaluate the effectiveness of Nelder-Mead Simplex, Genetic Algorithms, Simulated Annealing, Particle Swarm Optimization, Orthogonal search, and Random search in terms of the performance of the best candidate found under varying time limits.
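
As a rough illustration of the comparison described above, the Python sketch below runs two of the named heuristics, Random search and Simulated Annealing, over a tiny tuning space under a fixed evaluation budget. Everything specific here is an assumption made for the example: the parameter lists, the `synthetic_runtime` cost function (a made-up stand-in for compiling and timing a code variant), and the annealing schedule. None of it is taken from the paper.

```python
import math
import random

# Hypothetical tuning parameters; a real auto-tuner would use its own space.
TILE_SIZES = [8, 16, 32, 64, 128, 256]
UNROLLS = [1, 2, 4, 8, 16]


def synthetic_runtime(tile, unroll):
    """Made-up cost model standing in for a measured runtime (lower is better)."""
    return (math.log2(tile) - 6) ** 2 + (math.log2(unroll) - 2) ** 2 + 1.0


def random_search(budget, rng):
    """Evaluate `budget` uniformly sampled points; return (cost, point) of the best."""
    best = None
    for _ in range(budget):
        point = (rng.choice(TILE_SIZES), rng.choice(UNROLLS))
        cand = (synthetic_runtime(*point), point)
        if best is None or cand < best:
            best = cand
    return best


def simulated_annealing(budget, rng, t0=2.0, cooling=0.9):
    """Walk between neighbouring points, sometimes accepting worse ones."""
    i = rng.randrange(len(TILE_SIZES))
    j = rng.randrange(len(UNROLLS))
    cur = synthetic_runtime(TILE_SIZES[i], UNROLLS[j])
    best = (cur, (TILE_SIZES[i], UNROLLS[j]))
    temp = t0
    for _ in range(budget - 1):
        ni, nj = i, j
        step = rng.choice((-1, 1))
        if rng.random() < 0.5:
            ni = min(max(i + step, 0), len(TILE_SIZES) - 1)
        else:
            nj = min(max(j + step, 0), len(UNROLLS) - 1)
        cand = synthetic_runtime(TILE_SIZES[ni], UNROLLS[nj])
        # Always accept improvements; accept regressions with a probability
        # that shrinks as the temperature drops.
        if cand <= cur or rng.random() < math.exp((cur - cand) / temp):
            i, j, cur = ni, nj, cand
        best = min(best, (cur, (TILE_SIZES[i], UNROLLS[j])))
        temp = max(temp * cooling, 1e-9)
    return best


if __name__ == "__main__":
    for budget in (5, 20, 80):
        rs = random_search(budget, random.Random(0))
        sa = simulated_annealing(budget, random.Random(0))
        print(budget, rs, sa)
```

Fixing the number of evaluations rather than wall-clock time is a simplification; with real measurements each evaluation has a different cost, which is why the paper's comparison is framed in terms of time limits.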

Experiences and lessons learned with a portable interface to hardware performance counters

Dongarra, London, Moore, et al.

The PAPI project has defined and implemented a cross-platform interface to the hardware counters available on most modern microprocessors. The interface has gained widespread use and acceptance from hardware vendors, users, and tool developers. This paper reports on experiences with the community-based open-source effort to define the PAPI specification and implement it on a variety of platforms. Collaborations with tool developers who have incorporated support for PAPI are described. Issues related to interpretation and accuracy of hardware counter data and to the overheads of collecting this data are discussed. The paper concludes with implications for the design of the next version of PAPI.
