Most caching algorithms are oblivious to the timescale of requests, but caching systems are capacity constrained and, in practical cases, the hit rate may be limited by the cache's inability to serve requests fast enough. In particular, the hard-disk access time can be the key factor capping cache performance. In this paper, we present a new cache replacement policy that takes advantage of a hierarchical caching architecture, and in particular of the access-time difference between memory and disk. Our policy is optimal when requests follow the independent reference model, and it significantly reduces the hard-disk load, as our realistic, trace-driven evaluation also shows. Moreover, our policy applies in a more general context, since it can easily be adapted to minimize any retrieval cost, as long as costs are additive over cache misses.
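As a rough illustration of the kind of cost-aware allocation the abstract refers to, the Python sketch below greedily keeps the contents with the highest expected miss cost saved per unit of cache space under the independent reference model. It is only a sketch under these assumptions; the field and function names (`Content`, `greedy_allocation`) are hypothetical and this is not the paper's replacement policy or its optimality argument.

```python
# Illustrative only: greedy, cost-aware static allocation under the IRM.
from dataclasses import dataclass

@dataclass
class Content:
    name: str
    popularity: float      # stationary request probability (IRM)
    size: int              # cache space the content occupies
    retrieval_cost: float  # cost paid on a miss, e.g. disk access time

def greedy_allocation(catalogue, capacity):
    """Fill the cache with the contents saving the most expected cost per unit of space."""
    ranked = sorted(catalogue,
                    key=lambda c: c.popularity * c.retrieval_cost / c.size,
                    reverse=True)
    cached, used = [], 0
    for c in ranked:
        if used + c.size <= capacity:
            cached.append(c)
            used += c.size
    return cached

if __name__ == "__main__":
    catalogue = [Content("a", 0.50, 1, 1.0),   # popular, cheap to re-fetch
                 Content("b", 0.10, 1, 10.0),  # unpopular, expensive to re-fetch (disk)
                 Content("c", 0.40, 1, 2.0)]
    print([c.name for c in greedy_allocation(catalogue, capacity=2)])
```

In this toy run the unpopular but expensive-to-retrieve content is kept over the most popular one, which is the basic trade-off a cost-aware policy exploits.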
Cache policies that minimize the content retrieval cost have been studied through competitive analysis when miss costs are additive and the sequence of content requests is arbitrary. More recently, a cache utility maximization problem has been introduced, where contents have stationary popularities and utilities are strictly concave in the hit rates. This paper bridges the two formulations, considering linear costs and stationary content popularities. We show that minimizing the retrieval cost corresponds to solving an online knapsack problem, and we propose new dynamic policies inspired by simulated annealing, including DynqLRU, a variant of qLRU. For such policies we prove asymptotic convergence to the optimum under the characteristic time approximation. In real scenarios, popularities vary over time and are difficult to estimate. DynqLRU does not require popularity estimation, and our realistic, trace-driven evaluation shows that it significantly outperforms state-of-the-art policies, with cost reductions of up to 45%.
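To make the flavour of such a policy concrete, here is a minimal Python sketch of a cost-aware q-LRU variant: a hit refreshes the entry, while a miss admits the content only with a probability that grows with its retrieval cost and is scaled by a factor that decays over time in the spirit of simulated annealing. The admission rule, the decay schedule, and all names (`DynqLRUSketch`, `q0`, `cooling`) are assumptions made for illustration; this is not the paper's exact DynqLRU rule.

```python
import random
from collections import OrderedDict

class DynqLRUSketch:
    """Sketch of a cost-aware q-LRU variant (assumed admission rule)."""

    def __init__(self, capacity, q0=0.5, cooling=0.999):
        self.capacity = capacity
        self.cache = OrderedDict()   # key -> retrieval cost, kept in LRU order
        self.q = q0                  # global admission scale ("temperature")
        self.cooling = cooling

    def request(self, key, cost):
        self.q *= self.cooling                         # annealing-style decay
        if key in self.cache:                          # hit: move to MRU position
            self.cache.move_to_end(key)
            return True
        if random.random() < min(1.0, self.q * cost):  # cost-biased admission
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)         # evict the LRU entry
            self.cache[key] = cost
        return False

if __name__ == "__main__":
    cache = DynqLRUSketch(capacity=100)
    print(cache.request("video-42", cost=8.0))   # first request: a miss
```

As the admission scale shrinks, fewer contents enter the cache, so the cached set stabilizes around contents that are requested often and are costly to retrieve.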
Many different Distributed Hash Tables (DHTs) have been designed, but only a few have been successfully deployed. The implementation of a DHT needs to deal with practical aspects (e.g., related to churn or to delay) that are often only marginally considered in the design. In this paper, we analyze in detail the content retrieval process in KAD, the implementation of the Kademlia DHT that is part of several popular peer-to-peer clients. In particular, we present a simple model to evaluate the impact of different design parameters on the overall lookup latency. We then perform extensive measurements of lookup performance using an instrumented client. From the analysis of the results, we propose an improved scheme that significantly decreases the overall lookup latency without increasing the overhead.
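To show which design parameters the lookup latency depends on, the Python sketch below runs a generic iterative Kademlia-style lookup parameterized by the degree of parallelism (`alpha`) and the shortlist width (`k`). The `query` callable abstracting the network layer and the sequential loop standing in for alpha parallel RPCs are simplifying assumptions; this is neither KAD's actual implementation nor the improved scheme proposed in the paper.

```python
def iterative_lookup(target, bootstrap_contacts, query, alpha=3, k=8):
    """Iterative Kademlia-style lookup: repeatedly query the alpha closest
    not-yet-queried contacts (XOR metric on node IDs) until none remain."""
    shortlist = sorted(set(bootstrap_contacts), key=lambda c: c ^ target)[:k]
    queried = set()
    while True:
        candidates = [c for c in shortlist if c not in queried][:alpha]
        if not candidates:                        # converged: nothing left to ask
            return shortlist
        for contact in candidates:                # stand-in for alpha parallel RPCs
            queried.add(contact)
            for peer in query(contact, target):   # contacts the peer returns
                if peer not in shortlist:
                    shortlist.append(peer)
        shortlist = sorted(shortlist, key=lambda c: c ^ target)[:k]

if __name__ == "__main__":
    # Toy "network": every peer knows all node IDs and returns the 3 closest.
    nodes = list(range(64))
    query = lambda contact, target: sorted(nodes, key=lambda n: n ^ target)[:3]
    print(iterative_lookup(target=37, bootstrap_contacts=[0, 1, 2], query=query))
```

Each round of the loop costs at least one network round-trip (plus any timeout on unresponsive peers), so the number of rounds, `alpha`, and the timeout policy are the main levers on overall lookup latency.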
We study size-based schedulers and focus on the impact of inaccurate job size information on response time and fairness. Our intent is to revisit previous results, which suggest that even small errors in job size estimates degrade performance, thus limiting the applicability of size-based schedulers. We show that scheduling performance is tightly connected to workload characteristics: in the absence of large skew in the job size distribution, even extremely imprecise estimates suffice to outperform size-oblivious disciplines. Instead, when job sizes are heavily skewed, known size-based disciplines suffer. In this context, we show, for the first time, the dichotomy of over-estimation versus under-estimation. The former is, in general, less problematic than the latter, as its effects are localized to individual jobs; under-estimation, instead, leads to severe problems that may affect a large number of jobs. We present an approach to mitigate these problems: our technique requires no complex modifications to the original scheduling policies and performs very well. To support our claim, we carry out a simulation-based evaluation that covers an unprecedentedly large parameter space and takes into account a variety of synthetic and real workloads. As a consequence, we show that size-based scheduling is practical and outperforms alternatives in a wide array of use cases, even in the presence of inaccurate size information.
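The under-estimation failure mode can be reproduced with a toy experiment: the Python sketch below runs a naive SRPT-like scheduler driven by estimated sizes, in which one heavily under-estimated job keeps the best priority once its estimate is exhausted and delays every job queued behind it. This illustrates the problem only; it is not the mitigation technique proposed in the paper, and all job names and sizes are made up.

```python
def estimated_srpt(jobs):
    """Naive SRPT driven by *estimated* sizes: at every time unit, serve the
    job with the smallest estimated remaining work (estimate minus service)."""
    # jobs: list of (name, true_size, estimated_size), all arriving at t = 0
    remaining = {name: true for name, true, _ in jobs}   # true work left
    est_left = {name: est for name, _, est in jobs}      # estimated work left
    finish, t = {}, 0
    while remaining:
        t += 1
        # An under-estimated job reaches est_left <= 0, keeps the best priority,
        # and starves the others until its (much larger) true size is finished.
        name = min(remaining, key=lambda n: est_left[n])
        remaining[name] -= 1
        est_left[name] -= 1
        if remaining[name] == 0:
            finish[name] = t
            del remaining[name]
    return finish

if __name__ == "__main__":
    # One heavily under-estimated job delays five short jobs.
    jobs = [("big", 100, 1)] + [(f"short{i}", 2, 2) for i in range(5)]
    print(estimated_srpt(jobs))   # "big" monopolizes the server; shorts finish late
```

Flipping the example so that "big" is over-estimated instead delays only "big" itself, which is the dichotomy discussed above.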