Edge computing responds to users' requests with low latency by storing the relevant files at the network edge. Various data deduplication technologies are currently employed at edge to eliminate redundant data chunks for space saving. However, the lookup for the global huge-volume fingerprint indexes imposed by detecting redundancies can significantly degrade the data processing performance. Besides, we envision a novel file storage strategy that realizes the following rationales simultaneously: 1) space efficiency, 2) access efficiency, and 3) load balance, while the existing methods fail to achieve them at one shot. To this end, we report LOFS, a Lightweight Online File Storage strategy, which aims at eliminating redundancies through maximizing the probability of successful data deduplication, while realizing the three design rationales simultaneously. LOFS leverages a lightweight three-layer hash mapping scheme to solve this problem with constant-time complexity. To be specific, LOFS employs the Bloom filter to generate a sketch for each file, and thereafter feeds the sketches to the Locality Sensitivity hash (LSH) such that similar files are likely to be projected nearby in LSH tablespace. At last, LOFS assigns the files to real-world edge servers with the joint consideration of the LSH load distribution and the edge server capacity. Trace-driven experiments show that LOFS closely tracks the global deduplication ratio and generates a relatively low load std compared with the comparison methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.