Standard benchmarking provides the run times for given programs on given machines, but fails to provide insight into why those results were obtained (in terms of either machine or program characteristics), and fails to predict the run time of that program on some other machine, or of other programs on that machine. We have developed a machine-independent model of program execution to characterize both machine performance and program execution. By merging these machine and program characterizations, we can estimate execution time for arbitrary machine/program combinations. Our technique allows us to identify those operations, either on the machine or in the programs, that dominate the benchmark results. This information helps designers improve the performance of future machines, and helps users tune their applications to better utilize the performance of existing machines.

Here we apply our methodology to characterize benchmarks and predict their execution times. We present extensive run-time statistics for a large set of benchmarks, including the SPEC and Perfect Club suites. We show how these statistics can be used to identify important shortcomings in the programs. In addition, we give execution time estimates for a large sample of programs and machines and compare these against benchmark results. Finally, we develop a metric for program similarity that makes it possible to classify benchmarks with respect to a large set of characteristics.

† The material presented here is based on research supported principally by NASA under grant NCC2-550, and also in part by the National Science Foundation under grants MIP-8713274, MIP-9116578 and CCR-9117028,
Introduction

Benchmarking is the process of running a specific program or workload on a specific machine or system and measuring the resulting performance. This technique clearly provides an accurate evaluation of the performance of that machine for that workload. Benchmarks can be complete applications [UCB87, Dong88, MIPS89], the most heavily executed parts of a program (kernels) [Bail85, McMa86, Dodu89], or synthetic programs [Curn76, Weic88]. Unfortunately, benchmarking fails to provide insight into why those results were obtained (in terms of either machine or program characteristics), and fails to predict the run time of that program on some other machine, or of some other program on that machine [Worl84, Dong87]. This is because benchmarking characterizes neither the program nor the machine. In this paper we show that these limitations can be overcome with the help of a performance model based on the concept of a high-level abstract machine.

Our machine model consists of a set of abstract operations representing, for some particular programming language, the basic operators and language constructs present in programs. A special benchmark called a machine characterizer is used to measure experimentally the time it takes to execute each abstract operation (AbOp). Frequency counts of AbOps are obtained by instrumenting and running benchmarks. The ma...
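To make the merging step concrete: the natural reading of combining the two characterizations is a linear model in which the predicted run time of a program on a machine is the sum, over all AbOps, of the program's dynamic count for each operation multiplied by that machine's measured time per operation. The sketch below illustrates this reading; the AbOp names and all numeric values are hypothetical placeholders for illustration, not measurements from this paper.

    # Hedged sketch of the linear-combination estimate, assuming
    # predicted time = sum over AbOps of (dynamic count) * (seconds per AbOp).
    # All AbOp names and values are illustrative placeholders.

    def predict_run_time(abop_counts, abop_times):
        """Estimate execution time (seconds) of a program on a machine.

        abop_counts: dynamic frequency of each AbOp (program characterization)
        abop_times:  measured seconds per AbOp (machine characterization)
        """
        return sum(count * abop_times[op] for op, count in abop_counts.items())

    # Machine characterization: seconds per abstract operation (hypothetical).
    machine_m = {
        "add_int":    2.0e-8,  # integer add
        "mul_double": 9.0e-8,  # double-precision multiply
        "mem_load":   4.0e-8,  # load from memory
        "branch":     3.0e-8,  # conditional branch
    }

    # Program characterization: AbOp counts from an instrumented run (hypothetical).
    program_a = {
        "add_int":    5_000_000,
        "mul_double": 1_200_000,
        "mem_load":   8_000_000,
        "branch":     2_500_000,
    }

    print(f"estimated run time: {predict_run_time(program_a, machine_m):.3f} s")

Under this reading, the machine characterizer produces the per-operation times once per machine, and the instrumented benchmark run produces the counts once per program, so any machine/program pairing can be estimated without rerunning the benchmark on that machine.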