Multiagent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must, instead, discover a solution on their own, using learning. A significant part of the research on multiagent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multiagent reinforcement learning (MARL). A central issue in the field is the formal statement of the multiagent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents' learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim, either explicitly or implicitly, at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described along with some of the problem domains where MARL techniques have been applied. Finally, an outlook for the field is provided.
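To make the multiagent setting concrete, the following minimal Python sketch (an illustration only, not an algorithm from the survey) runs two independent Q-learners in a repeated two-action coordination game; the payoff function, learning rate, and exploration rate are assumptions chosen for demonstration. Because each agent keeps adapting while the other is also learning, the environment each one faces is nonstationary, which is exactly the stability-versus-adaptation tension mentioned above.

```python
# Illustrative sketch: two independent Q-learners in a repeated 2x2
# coordination game. Each agent treats the other as part of the environment.
import random

def reward(a0, a1):
    # Assumed payoffs: both agents get 1 if they pick the same action, else 0.
    return (1.0, 1.0) if a0 == a1 else (0.0, 0.0)

ACTIONS = [0, 1]
ALPHA, EPSILON, EPISODES = 0.1, 0.1, 5000

q = [{a: 0.0 for a in ACTIONS} for _ in range(2)]  # one Q-table per agent

def select(agent):
    if random.random() < EPSILON:                      # explore
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[agent][a])     # exploit

for _ in range(EPISODES):
    a0, a1 = select(0), select(1)
    r0, r1 = reward(a0, a1)
    # Stateless Q-learning update: each agent ignores the other's action.
    q[0][a0] += ALPHA * (r0 - q[0][a0])
    q[1][a1] += ALPHA * (r1 - q[1][a1])

print("Agent 0 prefers action", max(ACTIONS, key=lambda a: q[0][a]))
print("Agent 1 prefers action", max(ACTIONS, key=lambda a: q[1][a]))
```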
Control systems are making a tremendous impact on our society. Though invisible to most users, they are essential for the operation of nearly all devices, from basic home appliances to aircraft and nuclear power plants. Apart from technical systems, the principles of control are routinely applied and exploited in a variety of disciplines such as economics, medicine, social sciences, and artificial intelligence.

A common denominator in the diverse applications of control is the need to influence or modify the behavior of dynamic systems to attain prespecified goals. One approach to achieve this is to assign a numerical performance index to each state trajectory of the system. The control problem is then solved by searching for a control policy that drives the system along trajectories corresponding to the best value of the performance index. This approach essentially reduces the problem of finding good control policies to the search for solutions of a mathematical optimization problem.

Early work in the field of optimal control dates back to the 1940s with the pioneering research of Pontryagin and Bellman. Dynamic programming (DP), introduced by Bellman, is still among the state-of-the-art tools commonly used to solve optimal control problems when a system model is available. The alternative idea of finding a solution in the absence of a model was explored as early as the 1960s. In the 1980s, a revival of interest in this model-free paradigm led to the development of the field of reinforcement learning (RL). The central theme in RL research is the design of algorithms that learn control policies solely from the knowledge of transition samples or trajectories, which are collected beforehand or by online interaction with the system. Most approaches developed to tackle the RL problem are closely related to DP algorithms.

A core obstacle in DP and RL is that solutions cannot be represented exactly for problems with large discrete state-action spaces or continuous spaces. Instead, compact representations relying on function approximators must be used. This challenge was already recognized while the first DP techniques were being developed. However, it is only in recent years, largely alongside the advance of RL, that approximation-based methods have grown in diversity, maturity, and efficiency, enabling RL and DP to scale up to realistic problems.

This book provides an accessible in-depth treatment of reinforcement learning and dynamic programming methods using function approximators. We start with a concise introduction to classical DP and RL, in order to build the foundation for the remainder of the book. Next, we present an extensive review of state-of-the-art approaches to DP and RL with approximation. Theoretical guarantees are provided on the solutions obtained, and numerical examples and comparisons are used to illustrate the properties of the individual methods. The remaining three chapters are dedicated to a detailed presentation of representative algorithms from the three major classes of...
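As a rough illustration of the approximation issue raised in the preface, the sketch below (a toy example under stated assumptions, not material from the book) applies Q-learning with a linear function approximator over radial-basis features to a one-dimensional continuous-state task; the task definition, features, and step sizes are invented purely for demonstration.

```python
# Sketch: TD (Q-learning) updates with a linear function approximator on a
# toy continuous-state task where the agent must walk to the right boundary.
import numpy as np

rng = np.random.default_rng(0)

ACTIONS = [-0.1, +0.1]                        # move left or right
CENTERS = np.linspace(0.0, 1.0, 11)           # radial-basis feature centers

def features(x):
    return np.exp(-((x - CENTERS) ** 2) / (2 * 0.05 ** 2))

def q_value(theta, x, a_idx):
    return theta[a_idx] @ features(x)

theta = np.zeros((len(ACTIONS), len(CENTERS)))  # one weight vector per action
alpha, gamma, epsilon = 0.05, 0.95, 0.1

for episode in range(300):
    x = 0.0
    for _ in range(100):
        if rng.random() < epsilon:
            a_idx = int(rng.integers(len(ACTIONS)))
        else:
            a_idx = int(np.argmax([q_value(theta, x, i) for i in range(len(ACTIONS))]))
        x_next = float(np.clip(x + ACTIONS[a_idx], 0.0, 1.0))
        done = x_next >= 1.0
        r = 1.0 if done else 0.0
        target = r if done else r + gamma * max(q_value(theta, x_next, i)
                                                for i in range(len(ACTIONS)))
        # Stochastic gradient step on the squared temporal-difference error.
        theta[a_idx] += alpha * (target - q_value(theta, x, a_idx)) * features(x)
        x = x_next
        if done:
            break
```

The weight matrix replaces an exact table of Q-values, which is the essence of scaling DP/RL methods to continuous spaces.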
Policy-gradient-based actor-critic algorithms are amongst the most popular algorithms in the reinforcement learning framework. Their ability to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control, and finance. Although general surveys on reinforcement learning techniques already exist, no survey is dedicated specifically to actor-critic algorithms. This paper therefore describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation in order to deal with continuous state and action spaces. After starting with a discussion of the concepts of reinforcement learning and the origins of actor-critic algorithms, the paper describes the workings of the natural gradient, which has made its way into many actor-critic algorithms in the past few years. A review of several standard and natural actor-critic algorithms follows, and the paper concludes with an overview of application areas and a discussion of open issues.
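The following sketch shows the generic structure of an online actor-critic update with linear function approximation and a Gaussian policy over a continuous action, in the spirit of the standard (non-natural-gradient) methods such a survey covers; the toy regulation task, features, and step sizes are illustrative assumptions rather than any specific algorithm from the paper.

```python
# Sketch: one-step actor-critic with linear features and a Gaussian policy
# on a scalar regulation task (drive the state toward zero).
import numpy as np

rng = np.random.default_rng(1)
CENTERS = np.linspace(-2.0, 2.0, 9)

def phi(x):                                    # radial-basis features
    return np.exp(-((x - CENTERS) ** 2) / (2 * 0.5 ** 2))

w = np.zeros(len(CENTERS))                     # actor weights: mean action
v = np.zeros(len(CENTERS))                     # critic weights: state value
sigma, gamma = 0.3, 0.95
alpha_actor, alpha_critic = 0.01, 0.1

x = rng.uniform(-2.0, 2.0)
for step in range(20000):
    mu = w @ phi(x)
    a = rng.normal(mu, sigma)                  # stochastic continuous action
    x_next = float(np.clip(x + a, -2.0, 2.0))
    r = -x_next ** 2                           # reward: stay near the origin
    delta = r + gamma * (v @ phi(x_next)) - v @ phi(x)        # TD error
    v += alpha_critic * delta * phi(x)                         # critic update
    w += alpha_actor * delta * (a - mu) / sigma ** 2 * phi(x)  # policy-gradient step
    x = x_next
    if step % 200 == 0:                        # occasional reset for exploration
        x = rng.uniform(-2.0, 2.0)

for x_test in (-1.5, 0.0, 1.5):
    print(f"mean action at x={x_test:+.1f}: {w @ phi(x_test):+.2f}")
```

The critic's TD error serves as a low-variance estimate of the advantage, which is what the actor's gradient step scales, illustrating why actor-critic methods achieve lower-variance policy-gradient estimates than actor-only approaches.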
In fuzzy rule-based models acquired from numerical data, redundancy may be present in the form of similar fuzzy sets that represent compatible concepts. This results in an unnecessarily complex and less transparent linguistic description of the system. Using a measure of similarity, a rule-base simplification method is proposed that reduces the number of fuzzy sets in the model. Similar fuzzy sets are merged into a common fuzzy set that replaces them in the rule base. If the redundancy in the model is high, merging similar fuzzy sets may produce identical rules, which can then be merged as well, thereby also reducing the number of rules. The simplified rule base is computationally more efficient and linguistically more tractable. The approach has been successfully applied to fuzzy models of real-world systems.
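As an illustration of the similarity-driven merging idea (a sketch under assumed membership functions and an assumed threshold, not the paper's exact procedure), the code below computes a Jaccard-type set-theoretic similarity between two triangular fuzzy sets on a discretized universe and merges them when the similarity is high.

```python
# Sketch: similarity-based merging of two overlapping triangular fuzzy sets.
import numpy as np

universe = np.linspace(0.0, 10.0, 501)

def triangular(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    left = (x - a) / (b - a)
    right = (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

def similarity(mu1, mu2):
    # Cardinality of the intersection over cardinality of the union,
    # with min/max as the intersection/union operators.
    return np.sum(np.minimum(mu1, mu2)) / np.sum(np.maximum(mu1, mu2))

A = (2.0, 4.0, 6.0)            # two fuzzy sets representing compatible concepts
B = (2.5, 4.5, 6.5)
mu_A = triangular(universe, *A)
mu_B = triangular(universe, *B)

s = similarity(mu_A, mu_B)
print(f"similarity = {s:.2f}")

THRESHOLD = 0.6                # assumed merging threshold
if s > THRESHOLD:
    merged = tuple((pa + pb) / 2 for pa, pb in zip(A, B))
    print("merged set parameters:", merged)
```

Replacing both original sets by the merged one throughout the rule base can then make some rules identical, allowing the rules themselves to be merged as described above.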