Abstract: We present a power-saving method for large-scale storage systems of cloud data sharing services, particularly those providing media (video and photograph) sharing services. The idea behind our method is to periodically rearrange stored data in a disk array, so that the workload is skewed toward a small subset of disks, while other disks can be sent to standby mode. This idea is borrowed from the Popular Data Concentration (PDC) technique, but to avoid an increase in response time caused by the accesses …
“…'s Hadoop cluster, GreenHDFS [7] allocates the disks either hot or cold zone, and replace the data according to the age of data. Hasebe et al [8] used file access traces observed in Flickr to derive the individual file access frequencies and proposed the file exchange algorithm to skew the disk access frequencies. Some recent studies further took into account the correlation of file accesses to determine the file placement for energy saving [36] [37].…”
Section: Related Work
mentioning
confidence: 99%
“…Such web access patterns are commonly observed in other web services and are often approximated by Zipf distributions [40] [41]. In our experimental study in Section 6, we also use the access traces of Flickr investigated in the previous studies [6] [8].…”
Section: Workload Characteristics
mentioning
confidence: 99%
“…While Poisson arrivals were observed in real systems [45] [46], the validity of the first assumption A1 on the constant file access rate is arguable, since the popularity of sharing files can change over time as observed in real traces [8]. The solution to the energy-saving file placement problem only gives the best placement under the predicted file access rates for the next period.…”
Section: Energy-saving File Placement Problem Given a Set Of Files N
mentioning
confidence: 99%
“…It also clarifies that the necessary condition where the heuristic method achieves the optimal file placement satisfying the performance constraint. To confirm the effectiveness of the optimal file placement on a real storage system, we conducted experiments on our testbed system consisting of four disks with the real file access frequencies obtained from Flickr file access traces [8]. Our experimental results show the optimal file placement can cut 31.8% of energy overheads steadily compared with the baseline file placement in which file accesses are evenly distributed across the hard disks.…”
mentioning
confidence: 93%
“…Two essential functions in the implementation of PDC are the prediction of data access frequencies and the data placement method. While the prediction is usually made by analyzing historical workload data, the placement method typically relies on a heuristic algorithm that places data on disks in order of data access frequencies [5] [6] [8] [12]. The heuristic is intuitive, and the effectiveness was validated through some simulation studies [5] [6] [8].…”
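The frequency-ordered placement heuristic described in the statement above can be illustrated with a minimal sketch. This is not the cited authors' implementation: the `Disk` fields, the `place_by_frequency` function, the per-disk `max_load` performance limit, and the simple next-fit fill rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    capacity: int      # storage capacity, in hypothetical size units
    max_load: float    # assumed cap on the access rate a disk may serve
    used: int = 0
    load: float = 0.0
    files: list = field(default_factory=list)


def place_by_frequency(files, disks):
    """Place files on disks in descending order of predicted access rate.

    files: list of (name, size, access_rate) tuples (hypothetical format).
    Disks are filled one after another (next-fit); once a disk cannot take
    the next file, the sketch moves on and never returns to it.
    """
    ordered = sorted(files, key=lambda f: f[2], reverse=True)
    i = 0
    for name, size, rate in ordered:
        # Skip ahead until a disk has both enough space and enough load budget.
        while i < len(disks) and (
            disks[i].used + size > disks[i].capacity
            or disks[i].load + rate > disks[i].max_load
        ):
            i += 1
        if i == len(disks):
            raise ValueError(f"no disk can hold file {name!r}")
        disks[i].files.append(name)
        disks[i].used += size
        disks[i].load += rate
    return disks


# Illustrative data only; not taken from the Flickr traces.
example_files = [("a", 10, 5.0), ("c", 10, 1.0), ("b", 10, 3.0), ("d", 10, 0.5)]
example_disks = [Disk(capacity=20, max_load=8.0) for _ in range(3)]
placed = place_by_frequency(example_files, example_disks)
idle = [k for k, d in enumerate(placed) if not d.files]
```

In this sketch the most frequently accessed files end up on the first disks, and any disk left without files (collected in `idle`) is a candidate for standby mode. The sketch uses a single fixed access-rate prediction per file and does not model rate changes over time.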