2018
DOI: 10.1017/s0269888918000280

Q-Table compression for reinforcement learning

Abstract: Reinforcement learning (RL) algorithms are often used to compute agents capable of acting in environments without prior knowledge of the environment dynamics. However, these algorithms struggle to converge in environments with large branching factors and the resulting large state spaces. In this work, we develop an approach to compress the number of entries in a Q-value table using a deep auto-encoder. We develop a set of techniques to mitigate the large branching factor problem. We present the application o…
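The abstract describes the mechanism only at a high level. As a rough, hypothetical sketch (not the authors' implementation), shrinking a Q-value table with a deep auto-encoder can be pictured as encoding each raw state into a small latent code and using a discretised version of that code as the table key, so that many raw states share one entry. The class and function names below, the PyTorch architecture, and the rounding scheme are all assumptions made for illustration; training the auto-encoder on a reconstruction loss is omitted.

```python
# Hypothetical sketch: compress Q-table keys with a deep auto-encoder.
# Names (StateAutoEncoder, compress) are illustrative, not from the paper.
import torch
import torch.nn as nn
from collections import defaultdict

class StateAutoEncoder(nn.Module):
    def __init__(self, state_dim: int, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def compress(ae: StateAutoEncoder, state: torch.Tensor) -> tuple:
    """Map a raw state to a small, hashable key for the Q-table."""
    with torch.no_grad():
        _, z = ae(state)
    # Discretise the latent code so that nearby states collapse
    # onto the same table entry, shrinking the number of rows.
    return tuple(torch.round(z * 10).int().tolist())

ae = StateAutoEncoder(state_dim=100)
q_table = defaultdict(lambda: [0.0] * 4)   # 4 actions, purely illustrative
key = compress(ae, torch.randn(100))
q_table[key][0] += 0.1                     # ordinary tabular update on the compressed key
```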

Cited by 6 publications (7 citation statements). References 20 publications.
“…Each robot observes the environment locally and takes its action without relying on a centralized controller for synchronization between robots, and thus has no direct information about the other robots' intentions. However, it is good practice to greatly accelerate the learning process, alleviate the branching factor problem [35], and gain more confidence in the learned policies by visiting most of the states in the state space several times, which is achieved by sharing the observations of all robots only in the learning phase without harming the decentralization concept in the exploitation phase. We should bear in mind that sharing the observations is not necessary for our method.…”
Section: B) Contribution
confidence: 99%
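As a minimal sketch of that idea, and not the cited authors' code: each robot keeps its own Q-table, every transition gathered during the learning phase is broadcast to all tables, and action selection at exploitation time stays purely local. The names, the tabular Q-update, and the greedy policy below are illustrative assumptions.

```python
# Hypothetical sketch: observations are shared only during learning;
# at exploitation time each robot acts on its own table, fully decentralised.
from collections import defaultdict

ALPHA, GAMMA, N_ACTIONS = 0.1, 0.95, 4

def make_table():
    return defaultdict(lambda: [0.0] * N_ACTIONS)

def q_update(table, s, a, r, s_next):
    best_next = max(table[s_next])
    table[s][a] += ALPHA * (r + GAMMA * best_next - table[s][a])

robots = [make_table() for _ in range(3)]

def learning_step(transitions):
    """transitions: one (s, a, r, s_next) per robot, gathered this step."""
    for s, a, r, s_next in transitions:
        # Broadcast every robot's experience to every table (learning phase only).
        for table in robots:
            q_update(table, s, a, r, s_next)

def act(robot_id, s):
    """Exploitation: purely local greedy choice, no communication."""
    q = robots[robot_id][s]
    return max(range(N_ACTIONS), key=lambda a: q[a])
```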
“…However, our method does not rely on coordination between agents and, unlike QMIX, both the learning phase and the running phase are carried out in a decentralized fashion. Other research tries to compress the state space and reduce dimensionality [35], [45]. These works use deep auto-encoders to compress the state space with neural networks, mainly with the aim of gaining generalization ability rather than reducing memory usage.…”
Section: C) Related Work
confidence: 99%
“…Artificial neural networks (ANNs) are a type of machine learning model made up of numerous nodes arranged in layers that compute an output depending on node activations mediated by the weights of the connections between them. ANNs are capable of solving a variety of machine learning tasks, including classification, regression, and dimensionality reduction [44].…”
Section: AE-based Deep Clustering
confidence: 99%
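As a purely generic illustration of that description, and not tied to any cited implementation: each layer of an ANN computes a weighted sum of its inputs and passes it through an activation function. The layer sizes and activation below are arbitrary assumptions.

```python
# Generic illustration of the ANN description above: layers of nodes,
# weighted connections, and an activation function.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, layers):
    """layers: list of (weight_matrix, bias_vector) pairs."""
    for w, b in layers:
        x = relu(w @ x + b)
    return x

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(16, 8)), np.zeros(16)),
          (rng.normal(size=(4, 16)), np.zeros(4))]
y = forward(rng.normal(size=8), layers)   # 8 inputs -> 16 hidden -> 4 outputs
```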
“…The fourth paper, Q-table compression for reinforcement learning by Rosa Amado and Meneguzzi (2018), proposes a method to reduce the number of entries in a Q-value table by using a deep auto-encoder. Multi-agent reinforcement learning in which agents share experience updates is also applied to mitigate the large branching factors that arise when controlling teams of units in real-time strategy (RTS) games.…”
Section: Contents of the Special Issue
confidence: 99%