Distributed key-value stores (KVS) have been adopted rapidly by a wide range of applications in recent years owing to key advantages such as an HTTP-based RESTful application programming interface, high availability, and elasticity. Because of their strong scalability characteristics, KVS systems commonly use consistent hashing as the data placement mechanism. Although KVS systems offer many advantages, they were not designed to adapt dynamically to changing workloads, which often exhibit data access skew. Furthermore, the underlying physical storage nodes may be heterogeneous and do not expose their performance capabilities to higher-level data placement layers. In this paper, we address those issues and propose an essential step toward a dynamic autonomous solution by leveraging deep reinforcement learning. We design a self-learning approach that incrementally changes the data placement, improving load balancing. Our approach is dynamic in the sense that it is capable of avoiding hot spots, that is, overloaded storage nodes, when facing different workloads. We also design our solution to be pluggable: it assumes no prior knowledge of the storage nodes' capabilities, so different KVS deployments can make use of it. Our experiments show that our method performs well on changing workloads with data access skew, and we demonstrate the effectiveness of our approach through experiments in a distributed KVS deployment.
KEYWORDS
consistent hashing, deep reinforcement learning, key-value stores, load balancing, replica placement
INTRODUCTION
Distributed key-value stores (KVS) are a well-established approach for cloud data-intensive applications,1 mainly because they are capable of managing the huge data traffic driven by the explosive growth of applications such as social networks, e-commerce, and enterprise systems. In this work, the focus is on a particular type of KVS, also known as an Object Store, which can store and serve any type of data (eg, photos, images, and videos).2 Object Stores such as Dynamo3 and OpenStack-Swift4 have become widely accepted due to their scalability, high capacity, cost-effective storage, and reliable REST programming interface. These systems take advantage of a peer-to-peer architecture5 and replication techniques in order to guarantee high availability and scalability. The data placement of KVS systems is commonly based on a distributed hash table (DHT) and consistent hashing (CHT)6 with virtual nodes. Consistent hashing is a hashing scheme that minimizes the amount of data that must be moved when storage nodes are added or removed. Using only the hash of a data item's identifier, one can determine exactly which node should hold that item; this mapping of hashes to locations is usually known as the "ring" (a minimal sketch is given below). While this strategy provides item-balancing guarantees, it may not be very efficient in balancing the actual workload of the system, for the following reasons.
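To make the ring concrete, the following is a minimal, illustrative sketch of consistent hashing with virtual nodes in Python. It is not the placement code of any particular KVS or of the approach proposed in this paper; the class name ConsistentHashRing, the parameter vnodes_per_node, the example node names, and the use of MD5 as the ring hash are assumptions made only for this example.

import bisect
import hashlib

class ConsistentHashRing:
    # Minimal consistent-hashing ring with virtual nodes (illustrative only).

    def __init__(self, nodes=None, vnodes_per_node=64):
        self.vnodes_per_node = vnodes_per_node
        self._ring = []  # sorted list of (ring position, physical node) pairs
        for node in nodes or []:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # Any stable, uniform hash works; MD5 is used here purely for illustration.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each physical node owns several virtual positions, which smooths
        # the share of the hash space (and hence of the items) it receives.
        for i in range(self.vnodes_per_node):
            bisect.insort(self._ring, (self._hash(f"{node}#vn{i}"), node))

    def remove_node(self, node):
        # Only items mapped to this node's virtual positions move to their
        # clockwise successors; all other placements are unchanged.
        self._ring = [(pos, n) for (pos, n) in self._ring if n != node]

    def get_node(self, key):
        # Walk clockwise from the key's hash to the first virtual node position.
        idx = bisect.bisect_right(self._ring, (self._hash(key),))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]

# Usage: placement is determined by the key's hash alone, and adding a node
# relocates only roughly 1/N of the keys.
ring = ConsistentHashRing(["storage-a", "storage-b", "storage-c"])
print(ring.get_node("photo-42.jpg"))
ring.add_node("storage-d")
print(ring.get_node("photo-42.jpg"))

Note that the sketch balances the number of items per node, not the load: a few heavily accessed keys can still overload the node that happens to own them, which is precisely the limitation discussed next.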