Modern web applications are deployed in cloud computing systems because these systems offer virtually unlimited storage and computing power. One of the main back-end storage components of a cloud computing system is the distributed file system, which allows massive amounts of data to be stored and accessed. In most web applications deployed in such systems, read operations are performed more frequently than write operations. Consequently, increasing the efficiency of read operations in distributed file systems is a challenging and important research problem. The two main techniques used in distributed file systems to improve the performance of read operations are prefetching and caching. In this paper, we propose novel prefetching and multi-level caching algorithms based on the Access-Frequency and Access-Recency ranking of file blocks that were previously accessed by client application programs. We also propose new augmented ranking algorithms for prefetching file blocks that combine the Access-Frequency and Access-Recency rankings of those blocks. We use rank-based replacement algorithms to replace file blocks in the cache. The simulation results show that the proposed algorithms improve the performance of read operations on distributed file systems by 29% to 77% in comparison to algorithms proposed in the literature.
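The idea of ranking cached file blocks by both access frequency and access recency can be illustrated with a small sketch. This is not the algorithm from the paper: the class name `RankedBlockCache`, the weight `alpha`, and the linear scoring formula are assumptions made only for illustration. The block with the lowest combined score is the one replaced when the cache is full.

```python
from dataclasses import dataclass, field


@dataclass
class RankedBlockCache:
    """Toy block cache; evicts the block with the lowest combined rank."""

    capacity: int
    alpha: float = 0.5  # hypothetical weight: 1.0 = frequency only, 0.0 = recency only
    _clock: int = field(default=0, init=False)
    _freq: dict = field(default_factory=dict, init=False)
    _last: dict = field(default_factory=dict, init=False)
    _data: dict = field(default_factory=dict, init=False)

    def __post_init__(self):
        if self.capacity < 1:
            raise ValueError("capacity must be at least 1")

    def _score(self, block_id):
        # Assumed scoring rule: weighted sum of normalised access count
        # and normalised time of last access (both in the range (0, 1]).
        freq_part = self._freq[block_id] / max(self._freq.values())
        recency_part = self._last[block_id] / self._clock
        return self.alpha * freq_part + (1 - self.alpha) * recency_part

    def read(self, block_id, load):
        """Return (block, hit). `load` fetches a block on a cache miss."""
        self._clock += 1
        hit = block_id in self._data
        if not hit:
            if len(self._data) >= self.capacity:
                # Rank-based replacement: drop the lowest-scoring block.
                victim = min(self._data, key=self._score)
                for table in (self._data, self._freq, self._last):
                    del table[victim]
            self._data[block_id] = load(block_id)
        self._freq[block_id] = self._freq.get(block_id, 0) + 1
        self._last[block_id] = self._clock
        return self._data[block_id], hit
```

With a capacity of two blocks, reading `a`, `a`, `b`, and then `c` evicts `b` rather than `a`: `b` is more recent, but `a` has been read twice, so its combined score is higher under the assumed weighting.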
Data-intensive applications generate massive amounts of data, which are stored on cloud computing platforms that use distributed file systems as back-end storage. Most users of applications deployed on cloud computing systems read data more often than they write it. Hence, enhancing the performance of read operations is an important research issue. Prefetching and caching are important techniques for improving the performance of read operations in distributed file systems. In this research, we introduce a novel highly relevant frequent patterns (HRFP)-based algorithm that prefetches content from the distributed file system and stores it in the client-side caches of the same environment. We also introduce a new replacement policy and an efficient migration technique that moves patterns from the main-memory caches to the caches on solid-state devices, based on a new metric, namely the relevancy of the patterns. According to the simulation results, the proposed approach outperforms other algorithms suggested in the literature by a minimum of 15% and a maximum of 53%.
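The migration of patterns from main-memory caches to solid-state caches can be sketched as a two-tier structure. The abstract does not define the relevancy metric or the HRFP mining step, so this sketch assumes that each pattern arrives with a numeric relevancy score already computed; the class `TwoTierPatternCache` and its method names are hypothetical and do not come from the paper.

```python
class TwoTierPatternCache:
    """Toy two-tier cache: RAM tier in front of an SSD tier."""

    def __init__(self, ram_slots, ssd_slots):
        if ram_slots < 1 or ssd_slots < 1:
            raise ValueError("each tier needs at least one slot")
        self.ram_slots = ram_slots
        self.ssd_slots = ssd_slots
        self.ram = {}  # pattern -> relevancy score (assumed to be given)
        self.ssd = {}

    def admit(self, pattern, relevancy):
        """Place a pattern in RAM; migrate the least relevant one if RAM is full."""
        self.ssd.pop(pattern, None)  # a pattern lives in one tier at a time
        self.ram[pattern] = relevancy
        if len(self.ram) > self.ram_slots:
            victim = min(self.ram, key=self.ram.get)
            self._demote(victim, self.ram.pop(victim))

    def _demote(self, pattern, relevancy):
        # Migration step: move the pattern to the SSD tier, then drop the
        # least relevant SSD entry if that tier overflows.
        self.ssd[pattern] = relevancy
        if len(self.ssd) > self.ssd_slots:
            del self.ssd[min(self.ssd, key=self.ssd.get)]

    def locate(self, pattern):
        """Return "ram", "ssd", or None for a pattern."""
        if pattern in self.ram:
            return "ram"
        if pattern in self.ssd:
            return "ssd"
        return None
```

In this sketch a pattern is a tuple of block identifiers, for example `("b1", "b2")`. Keeping the demotion logic in one helper makes it easy to swap in a different relevancy rule without touching the admission path.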