Multi-agent reinforcement learning in games

Lu, Xiaohan

doi:10.22215/etd/2012-09679

Cited by 12 publications

(45 citation statements)

References 33 publications

(94 reference statements)

“…(x_p, y_p) ← (0, 0) {pursuer initial position} 10: initialize (x_e, y_e) randomly {evader initial position} 11: update s_p = ( ,̇) 12: update s_e = ( , d) 13: u ← Eq. (5.59) 17: play the game, observe the next states s′_p and s′_e and the reward r 18: end for 25: end for…”

Section: Q( )-Learning Fuzzy Inference System

mentioning

confidence: 99%

Differential Games

2014

Multi‐Agent Machine Learning

In the not too distant future, teams of robots will work together to accomplish a multitude of tasks. At the time of writing this book, we have seen the extensive use of aerial drones in surveillance, mapping, and other more unsavory tasks. We are also witnessing the beginning of truly autonomous vehicles for transportation. How long will it be before cars routinely drive themselves? We are currently on the verge of having multiple autonomous vehicles working together as some type of swarm. These groups of robots or autonomous vehicles will be a combination of aerial-, land-, and sea-based vehicles. These vehicles will have different configurations and capabilities. Unlike in the previous chapters, these vehicles will not be constrained to a grid, but, instead, they will be operating in a continuous and dynamically changing environment. The actions of these vehicles will be mathematically described by differential equations. The actions that the autonomous vehicles take will essentially and ultimately be control actions. These actions may be the setting of voltages on various actuators. We will refer to these types of systems as differential games (DGs). The goal of these types of agents is to learn how to work together and how to adapt to changes in their own or other robots' capabilities. For example, …

Multi-Agent Machine Learning: A Reinforcement Approach, First Edition. Howard M. Schwartz.

Section: Q( )-Learning Fuzzy Inference System

mentioning

confidence: 99%

“…33, point O on the invader's reachable region is the closest point to the 33. Reproduced from [18], © X. Lu.…”

mentioning

confidence: 99%

Differential Games

2014

Multi‐Agent Machine Learning

“…Moreover, it is well known that fuzzy inference systems are widely used as function approximators [4], [5]. Reinforcement fuzzy learning methods have recently been proposed for the problem of learning in differential games [5]-[9]. In [5], only the consequent parameters of the FLC and fuzzy inference system (FIS) are tuned using a fuzzy actor-critic learning algorithm.…”

Section: Introduction

mentioning

confidence: 99%

“…In addition, the FIS is used as an approximation to the action-value function, Q(s,a). In [9], fuzzy actor-critic learning is applied to the guarding territory differential game. In this learning technique, the consequent parameters are tuned to allow the defender to learn its Nash equilibrium strategy.…”

Section: Introduction

mentioning

confidence: 99%

An investigation of methods of parameter tuning for Q-Learning Fuzzy Inference System

Al-Talabi

Schwartz

2014

2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)

This paper investigates four methods of implementing a Q-Learning Fuzzy Inference System (QFIS) algorithm to autonomously tune the parameters of a fuzzy inference system. We use an actor-critic structure and simulate mobile robots playing the differential form of the pursuit evasion game. Both the critic and the actor are fuzzy inference systems. The four methods arise from the question of whether it is necessary to tune all the parameters (i.e., all the premise and consequent parameters) of the critic and the actor, or only their consequent parameters. The four methods are applied to three versions of the pursuit evasion game. In the first version, only the pursuer is learning. In the second version, the evader uses its higher maneuverability and plays intelligently against a self-learning pursuer. In the final version, both the pursuer and the evader are learning. We evaluate which parameters are best to tune and which have little impact on performance.
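The distinction the abstract draws between premise and consequent parameters can be illustrated with a minimal sketch, assuming a zero-order Takagi-Sugeno system with Gaussian membership functions and product inference. The class name `ZeroOrderTSK`, the learning rate `lr`, and the gradient-style step on a temporal-difference error are illustrative assumptions; the abstract does not state the paper's update equations.

```python
import math
from itertools import product


def gaussian(x, mean, sigma):
    """Gaussian membership degree; mean and sigma are premise parameters."""
    return math.exp(-((x - mean) / sigma) ** 2)


class ZeroOrderTSK:
    """Hypothetical zero-order Takagi-Sugeno fuzzy inference system."""

    def __init__(self, premise, consequents):
        # premise[i] is a list of (mean, sigma) pairs for input i.
        # consequents holds one constant per rule; there is one rule
        # for every combination of fuzzy sets across the inputs.
        self.premise = premise
        self.consequents = list(consequents)
        self.rules = list(product(*[range(len(sets)) for sets in premise]))
        if len(self.rules) != len(self.consequents):
            raise ValueError("need exactly one consequent per rule")

    def firing_strengths(self, inputs):
        """Normalized firing strength of each rule (product inference)."""
        strengths = []
        for rule in self.rules:
            w = 1.0
            for x, sets, idx in zip(inputs, self.premise, rule):
                mean, sigma = sets[idx]
                w *= gaussian(x, mean, sigma)
            strengths.append(w)
        total = sum(strengths) or 1.0  # avoid division by zero on underflow
        return [w / total for w in strengths]

    def output(self, inputs):
        """Weighted average of the rule consequents."""
        phi = self.firing_strengths(inputs)
        return sum(p * k for p, k in zip(phi, self.consequents))

    def update_consequents(self, inputs, td_error, lr):
        """Consequent-only tuning (assumed rule): the output's gradient
        with respect to each consequent is that rule's firing strength."""
        phi = self.firing_strengths(inputs)
        self.consequents = [
            k + lr * td_error * p for k, p in zip(self.consequents, phi)
        ]


# One input with two fuzzy sets, so two rules.
fis = ZeroOrderTSK(premise=[[(-1.0, 1.0), (1.0, 1.0)]], consequents=[0.0, 2.0])
value = fis.output([0.0])
fis.update_consequents([0.0], td_error=1.0, lr=0.1)
```

Tuning the premise parameters as well would mean also adjusting each `(mean, sigma)` pair, which changes the firing strengths themselves and not only the rule outputs.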

“…Therefore, it is preferable to use these algorithms in an unsupervised learning manner. In [2], [8]-[12], reinforcement learning (RL) methods have also been proposed for the problem of tuning the FLC parameters in an unsupervised manner.…”

Section: Introduction

mentioning

confidence: 99%

A two stage learning technique using PSO-based FLC and QFIS for the pursuit evasion differential game

Al-Talabi

Schwartz

2014

2014 IEEE International Conference on Mechatronics and Automation

This paper presents a two-stage learning technique that combines a particle swarm optimization (PSO)-based fuzzy logic control (FLC) algorithm with the Q-Learning fuzzy inference system (QFIS) algorithm. The PSO algorithm is used as a global optimizer to autonomously tune the parameters of a fuzzy logic controller, while the QFIS algorithm is used as a local optimizer. We simulate mobile robots playing the differential form of the pursuit evasion game. The game is played such that the pursuer should learn its default control strategy on-line by interacting with the evader. We assume that the evader plays a well-defined strategy, which is to run away along the line of sight. The pursuer's learning process depends on the rewards received from its environment. The proposed technique is compared through simulation with the default control strategy, the PSO-based fuzzy logic control algorithm, and the QFIS algorithm. Simulation results show that the proposed learning technique outperforms the PSO-based fuzzy logic control algorithm and the QFIS algorithm with respect to learning time, which is an important factor in on-line applications.
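The evader strategy described in the abstract, running away along the line of sight, can be sketched as follows. The point-mass kinematics in `step`, the function names, and the numeric values are assumptions for illustration only; the abstract does not specify the robot model.

```python
import math


def line_of_sight_heading(pursuer_xy, evader_xy):
    """Angle (radians) of the line from the pursuer to the evader.

    An evader moving with this heading travels directly away from the pursuer.
    """
    dx = evader_xy[0] - pursuer_xy[0]
    dy = evader_xy[1] - pursuer_xy[1]
    return math.atan2(dy, dx)


def step(position, heading, speed, dt):
    """Advance a point-mass agent (assumed model) by one time step."""
    x, y = position
    return (
        x + speed * math.cos(heading) * dt,
        y + speed * math.sin(heading) * dt,
    )


# Hypothetical positions: pursuer at the origin, evader at (1, 1).
pursuer = (0.0, 0.0)
evader = (1.0, 1.0)
heading = line_of_sight_heading(pursuer, evader)
evader = step(evader, heading, speed=1.0, dt=0.1)
```

With a stationary pursuer, each call to `step` with this heading increases the pursuer-evader distance by `speed * dt`.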
