2016
DOI: 10.1504/ijbdi.2016.077358

Towards cost-effective and high-performance caching middleware for distributed systems

Abstract: One performance bottleneck of distributed systems lies in the hard disk drive (HDD), whose single read/write head physically limits its ability to serve concurrent I/Os. Although solid-state drives (SSDs) have been available for years, HDDs remain the dominant storage medium because of their large capacity and low cost. This paper proposes a caching middleware that manages the underlying heterogeneous storage devices so that distributed file systems can achieve both high performance and low cost. Specifically, we design …

citations
Cited by 11 publications
(6 citation statements)
references
References 58 publications
“…In the former, there is no support for automatically caching or uncaching files, while in the latter, there is support for a limited number of static policies for storing files on specific tiers [1]. A caching middleware has been proposed for using local SSDs as a read-only cache of local HDDs in HDFS [55], which uses a heuristic file-placement algorithm to improve the cache but assumes that a queue of requested files is known in advance. hatS [31] and OctopusFS [29] extended HDFS to support fine-grained storage tiering based on which file blocks are replicated and stored across both the cluster nodes and the storage tiers (see Figure 1(b)).…”
Section: Distributed File Systems and Tiering (mentioning)
confidence: 99%
“…Upon arrival of a new write request in a read-only cache architecture [34,55,10,68,65,39] where the accessing block is not located in SSD, the request is completed by successfully recording it to HDD via 8. When it was already cached in SSD for priority read operations, the request is considered as completed only after updating the HDD copy of data and discarding the SSD copy, successfully.…”
Section: SSD as a Read-Only Cache (mentioning)
confidence: 99%
“…In this type of storage architectures [42,43,53,66], when a new write request arrives and its accessing data is not located in SSD, the data needs to be recorded to HDD and the request is completed only when the recording is successful. If the access data is available in SSD, the data in HDD needs to be updated, and the data in SSD may be discarded, or updated as well.…”
Section: SSD as a Read-Only Cache (mentioning)
confidence: 99%
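
The write path that the two statements above describe for a read-only SSD cache can be sketched roughly as follows. This is a minimal illustration, not the cited papers' implementation: the class name `ReadOnlyCachedStore`, the in-memory dictionaries standing in for the HDD and SSD, the `update_ssd` flag, and the cache-on-first-read policy without eviction are all assumptions made for the example.

```python
class ReadOnlyCachedStore:
    """Toy model of an HDD fronted by a read-only SSD cache (hypothetical)."""

    def __init__(self):
        self.hdd = {}  # block id -> data; the authoritative copy
        self.ssd = {}  # block id -> data; cached copy, used to serve reads

    def read(self, block_id):
        # Serve from SSD when the block is cached.
        if block_id in self.ssd:
            return self.ssd[block_id]
        data = self.hdd[block_id]
        # Assumption: cache every block on first read and never evict.
        self.ssd[block_id] = data
        return data

    def write(self, block_id, data, update_ssd=False):
        # The write always goes to HDD; it counts as complete only
        # once the HDD copy has been recorded.
        self.hdd[block_id] = data
        if block_id in self.ssd:
            if update_ssd:
                # Variant: refresh the cached copy as well.
                self.ssd[block_id] = data
            else:
                # Variant: discard the now-stale cached copy.
                del self.ssd[block_id]


store = ReadOnlyCachedStore()
store.write("b1", b"v1")   # not cached: recorded on HDD only
store.read("b1")           # first read brings the block into SSD
store.write("b1", b"v2")   # cached: HDD updated, SSD copy discarded
```

Either variant leaves the HDD copy authoritative, so a later read never sees stale data from the SSD.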
“…To reduce the unnecessary write operations, [42] proposes a method to check the data hotness based on the demotion counter and the proposed control metric, and migrate the hot data blocks to SSD. In [66], a heuristic file-placement algorithm is designed to improve the cache performance by considering the IO patterns of the incoming workloads. Meanwhile, a distributed caching middleware is implemented at user level to detect and manipulate the frequently accessed data.…”
Section: SSD as a Read-Only Cache (mentioning)
confidence: 99%