Decentralized Indirect Methods for Learning Automata Games

Tilak, Omkar; Martin, Ryan; Mukhopadhyay, Snehasis

doi:10.1109/tsmcb.2011.2118749

Cited by 27 publications

(19 citation statements)

References 35 publications


“…LA selects the optimal one through interacting with an environment that provides an appropriate response (reward or penalty), which is used to update the selection probability vector of actions, as described in Algorithm 1. In recent years, LAs have been successfully applied to systems that possess incomplete knowledge about the environment, such as game playing [7], decision analysis [13], multiconstraint assignment [6], complete L-fuzzy matrix [14], and object partitioning [15]. Furthermore, LAs have been used to solve behavior recognition problems [16], stabilize distributed queuing systems [17] and continuous function optimization [18].…”

Section: A. LA and Continuous Pursuit Reward-Inaction (CP_RI) Algorithm (mentioning)

confidence: 99%
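As a rough illustration of the reward/penalty update described in the statement above, the sketch below implements the textbook linear reward-inaction scheme. It is not the CP_RI algorithm of the citing paper, whose Algorithm 1 is not reproduced on this page. The function name `reward_inaction_update` and the parameter `rate` are illustrative assumptions.

```python
def reward_inaction_update(probs, chosen, rewarded, rate=0.1):
    """Return the action probability vector after one environment response.

    Assumed scheme: linear reward-inaction. On a reward, probability mass is
    shifted toward the chosen action; on a penalty, nothing changes.
    """
    if not rewarded:
        # Inaction: a penalty leaves the probabilities untouched.
        return list(probs)
    # Shrink every probability, then give the freed mass to the chosen action.
    updated = [(1.0 - rate) * p for p in probs]
    updated[chosen] += rate
    return updated


if __name__ == "__main__":
    print(reward_inaction_update([0.5, 0.5], chosen=0, rewarded=True))
```

Because the shrink step removes exactly `rate` of the total mass and the chosen action receives `rate` back, the vector keeps summing to one after every update.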


Swarm intelligence to coordinate decentralized learning automata in identical payoff games

Zhang et al., 2015

2015 IEEE Congress on Evolutionary Computation (CEC)


Decentralized learning automata (DLA) consist of a large number of learning automata (LAs) that learn independently, without exchanging information. Although these LAs can theoretically reach a Nash equilibrium, their learning efficiency is drastically weakened because each LA operates in a nonstationary rather than a stationary environment. Swarm intelligence is an appropriate mechanism for coordinating the LAs in a DLA, since it helps all LAs make a swarm decision. Particle swarm optimization (PSO), a representative swarm intelligence algorithm, can achieve this goal. PSO derives from simulating the behavior of flying birds and exhibits effective swarm intelligence through cooperation among particles. It is easy to implement and has very little computational and memory overhead. This work uses PSO to provide a coordination mechanism for DLA through swarm intelligence, called learning automata swarm optimization (LASO). PSO's swarm best (gbest) is treated as the swarm decision, which LASO uses as "optimal" estimator information; all LAs then use it to update their selection probability vectors. LASO is then applied to identical payoff games, and a model is constructed to solve them effectively. In this model, a PSO with equal resampling and top-N additional re-evaluations (PSO-ERN) algorithm estimates the matrix of the game's reward probabilities, because the game's stochasticity causes repeated evaluations of the same play to yield different environmental responses. Computational experiments in identical payoff games demonstrate that LASO converges faster and more accurately than the existing DLA.



“…Each LA is responsible for learning a variable, such as a high dimensional parameter or a task in scheduling problem, etc. In order to supply a learning model that consists of many LAs that learn at the same time, decentralized learning automata (DLA) [6] [7] are presented. In general, those LAs in DLA are VSSA.…”

Section: Introduction (mentioning)

confidence: 99%

Swarm intelligence to coordinate decentralized learning automata in identical payoff games

Zhang et al., 2015

2015 IEEE Congress on Evolutionary Computation (CEC)


“…We use the Decentralized Pursuit Learning game Algorithm (DPLA) [13]. In the DPLA, each participating automata maintains a vector […] where k refers to the current trial and 1 ≤ […] ≤ […] is the automaton index.…”

Section: Decentralized Pursuit Learning Game Algorithm (mentioning)

confidence: 99%
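To make the idea of several automata learning side by side from one shared payoff more concrete, here is a minimal sketch of a team of pursuit-style automata in an identical-payoff game. It is not the DPLA of the cited paper: the update rule (a continuous pursuit step applied on every trial), the running-average estimates, and the names `run_team`, `reward_prob`, and `rate` are all assumptions made for illustration.

```python
import random


def run_team(n_actions, reward_prob, trials, rate, rng):
    """Simulate independent automata that all receive the same binary payoff.

    n_actions   -- list with the number of actions of each automaton
    reward_prob -- hypothetical callable mapping a joint action (tuple) to a
                   reward probability in [0, 1]
    """
    probs = [[1.0 / n] * n for n in n_actions]
    estimates = [[0.0] * n for n in n_actions]
    counts = [[0] * n for n in n_actions]
    for _ in range(trials):
        # Every automaton samples its own action; no information is exchanged.
        joint = tuple(rng.choices(range(len(p)), weights=p)[0] for p in probs)
        payoff = 1.0 if rng.random() < reward_prob(joint) else 0.0
        for i, action in enumerate(joint):
            # Each automaton keeps a private running-average reward estimate.
            counts[i][action] += 1
            estimates[i][action] += (payoff - estimates[i][action]) / counts[i][action]
            best = max(range(n_actions[i]), key=estimates[i].__getitem__)
            # Pursuit step: move the probability vector toward the best estimate.
            probs[i] = [(1.0 - rate) * p for p in probs[i]]
            probs[i][best] += rate
    return probs, estimates


if __name__ == "__main__":
    payoff_fn = lambda joint: 0.9 if joint == (0, 0) else 0.2
    final_probs, _ = run_team([2, 2], payoff_fn, 2000, 0.01, random.Random(0))
    print(final_probs)
```

Each automaton sees only its own action and the common payoff, which is the decentralized setting the citation statements refer to; whether the team settles on the best joint action depends on the payoff structure and the learning rate.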

Decentralized and partially decentralized reinforcement learning for designing a distributed wetland system in watersheds

Tilak, Babbar-Sebens, Mukhopadhyay, 2011

2011 IEEE International Conference on Systems, Man, and Cybernetics


In this paper, we use identical-payoff games of reinforcement learning agents as a framework to solve a complex multi-criteria optimization problem in watershed management. Multiple analytical criteria are used to assess design decisions for creating a distributed network of wetlands in the watershed. Decentralized game algorithms of reinforcement learning agents, as well as a genetic-algorithm-based method, are used for the analysis. Simulation studies are presented that compare the efficiency of the reinforcement learning approaches with a multiobjective genetic-algorithm-based approach.


“…the adaption to the environment. LAs have found a broad range of applications reported in the literature, such as game playing [2]-[4], pattern recognition [5], classification [6], knapsack problem [7], tutorial-like system [8], [9], object partitioning [10]-[13], cellular automata [14], telephony routing [15], [16], scheduling [17], minimum-spanning circle problem [18], congestion avoidance [19], function optimization [20], [21], resource allocation and assignment problems [22]-[24], automaton controller [25], control absorption columns, flexible manufacturing plants, and other applications such as dryers, vehicles, irrigation canals, multimedia network, robots, liquid-liquid extraction columns, bioreactors, distributed fuzzy logic processors, image processing, and data compression. Various LAs, their properties and applications have been reviewed in survey papers [26], [27] and books [28]-[32].…”

Section: Introduction (mentioning)

confidence: 99%

Fast and Epsilon-Optimal Discretized Pursuit Learning Automata

Zhang, Wang, Zhou, 2015

IEEE Trans. Cybern.


Learning automata (LA) are powerful tools for reinforcement learning. A discretized pursuit LA is the most popular one among them. During an iteration its operation consists of three basic phases: 1) selecting the next action; 2) finding the optimal estimated action; and 3) updating the state probability. However, when the number of actions is large, the learning becomes extremely slow because there are too many updates to be made at each iteration. The increased updates are mostly from phases 1 and 3. A new fast discretized pursuit LA with assured ε-optimality is proposed to perform both phases 1 and 3 with the computational complexity independent of the number of actions. Apart from its low computational complexity, it achieves faster convergence speed than the classical one when operating in stationary environments. This paper can promote the applications of LA toward the large-scale-action oriented area that requires efficient reinforcement learning tools with assured ε-optimality, fast convergence speed, and low computational complexity for each iteration.
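The three phases named in the abstract can be sketched as follows. This is a minimal version of a classical discretized pursuit automaton under assumptions not stated on this page: a reward-inaction variant, running-average reward estimates, and a fixed step size. It is the straightforward form whose per-iteration cost grows with the number of actions, not the fast algorithm the paper proposes. The names `discretized_pursuit_step`, `environment`, and `step` are illustrative.

```python
import random


def discretized_pursuit_step(probs, estimates, counts, environment, step, rng):
    """Run one iteration in place and return the action that was tried.

    environment -- hypothetical callable: action index -> truthy if rewarded
    step        -- fixed probability decrement (the discretization resolution)
    """
    n = len(probs)
    # Phase 1: select the next action from the current probability vector.
    action = rng.choices(range(n), weights=probs)[0]
    rewarded = bool(environment(action))
    # Update the running-average reward estimate of the tried action.
    counts[action] += 1
    estimates[action] += (float(rewarded) - estimates[action]) / counts[action]
    # Phase 2: find the action with the highest current estimate.
    best = max(range(n), key=estimates.__getitem__)
    # Phase 3: on a reward, lower every other probability by one step
    # (clipped at zero) and give the remaining mass to the best action.
    if rewarded:
        for j in range(n):
            if j != best:
                probs[j] = max(probs[j] - step, 0.0)
        probs[best] = 1.0 - sum(p for j, p in enumerate(probs) if j != best)
    return action


if __name__ == "__main__":
    rng = random.Random(0)
    probs, est, cnt = [0.25] * 4, [0.0] * 4, [0] * 4
    env = lambda a: rng.random() < (0.8 if a == 2 else 0.3)
    for _ in range(3000):
        discretized_pursuit_step(probs, est, cnt, env, 0.005, rng)
    print(probs)
```

Phases 1 and 3 both touch every action here (sampling from the full vector, then decrementing all non-best entries), which is the cost the abstract identifies as the bottleneck for large action sets.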
