Abstract-MapReduce systems face enormous challenges due to the increasing growth, diversity, and consolidation of the data and computation involved. Provisioning, configuring, and managing large-scale MapReduce clusters require realistic, workload-specific performance insights that existing MapReduce benchmarks are ill-equipped to supply. In this paper, we build the case for going beyond benchmarks in MapReduce performance evaluations. We analyze and compare two production MapReduce traces to develop a vocabulary for describing MapReduce workloads. We show that existing benchmarks fail to capture the rich workload characteristics observed in traces, and we propose a framework to synthesize and execute representative workloads. We demonstrate that performance evaluations using realistic workloads give cluster operators new ways to identify workload-specific resource bottlenecks and to make workload-specific choices of MapReduce task schedulers. We expect that, once available, workload suites will allow cluster operators to accomplish previously challenging tasks beyond what we can now imagine, thus serving as a useful tool for designing and managing MapReduce systems.
Abstract-A recent trend for data-intensive computations is to use pay-as-you-go execution environments that scale transparently to the user. However, providers of such environments must tackle the challenge of configuring their systems to provide maximal performance while minimizing the cost of the resources used. In this paper, we use statistical models to predict resource requirements for Cloud computing applications. Such a prediction framework can guide system design and deployment decisions such as scale, scheduling, and capacity. In addition, we present an initial design of a workload generator that can be used to evaluate alternative configurations without the overhead of reproducing a real workload. This paper focuses on statistical modeling and its application to data-intensive workloads.
Computer-system logs provide a glimpse into the states of a running system. Instrumentation occasionally generates short messages that are collected in a system-specific log. The content and format of logs can vary widely from one system to another and even among components within a system. A printer driver might generate messages indicating that it had trouble communicating with the printer, while a Web server might record which pages were requested and when. As the content of logs varies, so do their uses. The printer log might be used for troubleshooting, while the Web-server log is used to study traffic patterns to maximize advertising revenue. Indeed, a single log may be used for multiple purposes: information about the traffic along different network paths, called flows, might help a user optimize network performance or detect a malicious intrusion; or call-detail records can monitor who called whom and when, and upon further analysis can reveal call volume and drop rates within entire cities. This article provides an overview of some of the most common applications of log analysis, describes some of the logs that might be analyzed and the methods of analyzing them, and elucidates some of the lingering challenges. Log analysis is a rich field of research; while it is not our goal to provide a literature survey, we do intend to provide a clear understanding of why log analysis is both vital and difficult.

DEBUGGING

Many logs are intended to facilitate debugging. As Brian Kernighan wrote in Unix for Beginners in 1979, "The most effective debugging tool is still careful thought, coupled with judiciously placed print statements."
Although today's programs are orders of magnitude more complex than those of 30 years ago, many people still log using printf to console or local disk and use some combination of manual inspection and regular expressions to locate specific messages or patterns. The simplest and most common use for a debug log is to grep for a specific message. If a server operator believes that a program crashed because of a network failure, then he or she might try to find a "connection dropped" message in the server logs. In many cases, it is difficult to figure out what to search for, as there is no well-defined mapping between log messages and observed symptoms. When a Web service suddenly becomes slow, the operator is unlikely to see an obvious error message saying, "ERROR: The service latency increased by 10% because bug X, on line Y, was triggered." Instead, users often search for severity keywords such as "error" or "failure." Such severity levels are often used inaccurately, however, because a developer rarely has complete knowledge of how the code will ultimately be used. Furthermore, red-herring messages (e.g., "no error detected") may pollute the result set with
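The keyword-based searching described above can be sketched in a few lines. This is a minimal illustration, not any particular operator's tooling; the patterns and sample messages are assumptions chosen to show the red-herring problem.

```python
import re

# Illustrative severity keywords an operator might grep for.
SEVERITY = re.compile(r"\b(error|failure|fatal)\b", re.IGNORECASE)
# Red-herring phrases such as "no error detected" that match the
# keywords but indicate the absence of a problem.
RED_HERRING = re.compile(r"\bno\s+(error|failure)s?\b", re.IGNORECASE)

def grep_severity(lines):
    """Return lines mentioning a severity keyword, skipping red herrings."""
    return [ln for ln in lines
            if SEVERITY.search(ln) and not RED_HERRING.search(ln)]

log = [
    "INFO: connection established",
    "ERROR: connection dropped",
    "DEBUG: no error detected in checksum pass",
]
print(grep_severity(log))  # -> ['ERROR: connection dropped']
```

A naive `grep -i error` would also return the third line; filtering out negated phrases is one small mitigation, though it cannot fix severity levels that were mislabeled by the developer in the first place.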
Compression enables us to shift computation load from IO to CPU. In modern datacenters, where energy efficiency is a growing concern, the benefits of compression have not been fully exploited. We develop a decision algorithm that helps MapReduce users identify when and where to use compression. For some jobs, using compression yields energy savings of up to 60%. As MapReduce represents a common computation framework for Internet datacenters, we believe our findings will have a significant impact on improving datacenter energy efficiency.
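A when-to-compress decision of the kind described above can be sketched as a simple cost comparison: compress when the IO time saved exceeds the CPU time spent compressing. This is an illustrative sketch under assumed parameters, not the paper's actual algorithm; the function name and thresholds are hypothetical.

```python
def should_compress(data_bytes, ratio, compress_bps, io_bps):
    """Decide whether compressing is a net win for one data transfer.

    data_bytes   -- size of the uncompressed data
    ratio        -- compressed size / original size (e.g. 0.4)
    compress_bps -- compression throughput, bytes per second
    io_bps       -- raw IO throughput, bytes per second
    """
    io_time_saved = data_bytes * (1 - ratio) / io_bps
    cpu_time_spent = data_bytes / compress_bps
    return io_time_saved > cpu_time_spent

# Example: 1 GB of compressible data, a fast codec, disk-bound IO.
print(should_compress(1 << 30, 0.4, 200e6, 100e6))  # -> True
```

The same comparison could be made in energy terms rather than time by weighting each side with per-byte energy costs; highly compressible, IO-bound data favors compression, while incompressible data (ratio near 1) or very fast IO does not.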