1998
DOI: 10.1109/5326.704563

Fuzzy inference system learning by reinforcement methods

Abstract: Fuzzy Actor-Critic Learning (FACL) and Fuzzy Q-Learning (FQL) are reinforcement learning methods based on Dynamic Programming (DP) principles. In this paper, they are used to tune online the conclusion part of Fuzzy Inference Systems (FIS). The only information available for learning is the system feedback, which describes in terms of reward and punishment the task the fuzzy agent has to realize. At each time step, the agent receives a reinforcement signal according to the last action it has performed in the p…

Cited by 327 publications (224 citation statements)
References 39 publications
“…This section presents the FQL algorithm as given in [25]. Let the state vector s_t = (s_t^1, …, s_t^j, …, s_t^J), where s_t^j is the j-th element of the state vector before fuzzification.…”
Section: FQL Algorithm Description
confidence: 99%
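The excerpt above describes the first step of FQL: each element of the state vector is fuzzified before any rule fires. A minimal sketch of that step, assuming triangular membership functions (the function names and parameter layout here are illustrative, not the paper's exact formulation):

```python
import numpy as np

def triangular_mf(x, a, b, c):
    """Triangular membership: rises from a to a peak at b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzify(state, mf_params):
    """Membership degrees of each state element in each of its fuzzy sets.

    state: length-J sequence (the state vector s_t before fuzzification).
    mf_params[j]: list of (a, b, c) triples, one per fuzzy set of element j.
    Returns one membership-degree vector per state element.
    """
    return [
        np.array([triangular_mf(s, a, b, c) for (a, b, c) in mf_params[j]])
        for j, s in enumerate(state)
    ]

# Example: a 2-element state with three overlapping fuzzy sets per element.
params = [
    [(-1.0, 0.0, 1.0), (0.0, 1.0, 2.0), (1.0, 2.0, 3.0)],
    [(-2.0, -1.0, 0.0), (-1.0, 0.0, 1.0), (0.0, 1.0, 2.0)],
]
degrees = fuzzify([0.5, -0.5], params)  # each element sits between two sets
```

With overlapping sets, each input typically activates two neighboring fuzzy sets with complementary degrees, which is what lets the later rule-firing stage interpolate smoothly between conclusions.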
“…Fuzzy approximators have typically been used in model-free RL techniques such as Q-learning [13,15,17] and actor-critic algorithms [2,20]. Most of these approaches are heuristic in nature, and their theoretical properties have not yet been investigated.…”
Section: Related Work
confidence: 99%
“…The first term in the sum depends linearly on ε′_Q, which is related to the accuracy of the fuzzy approximator and is more difficult to control. This ε′_Q-dependent term also contributes to the suboptimality of the asymptotic solutions (17), (19). Ideally, one can find ε′_Q = min_{Q ∈ Q̂} ‖Q* − Q‖_∞, which provides the smallest upper bounds in (17)–(20).…”
Section: Theorem 3 (Near-Optimality): Denote the Set of Q-Functions Rep…
confidence: 99%
“…2). The fuzzy net has an RBF-like architecture and a powerful ability to classify continuous input and give continuous output (Jouffe, 1998). It has been successfully adopted in cart-pole balance control (Wang, X. S., et al., 2007) and adaptive behavior learning of autonomous robots (Samejima and Omori, 1999), (Perez-…”
Section: Fuzzy Net
confidence: 99%
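The "RBF-like architecture" this excerpt attributes to Jouffe (1998) can be sketched as: Gaussian-shaped rule activations over the input, normalized into firing strengths, then a weighted sum of per-rule conclusions (the part that FQL/FACL tunes online). The Gaussian choice and function names below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def fuzzy_net_output(x, centers, widths, conclusions):
    """RBF-like fuzzy inference: Gaussian rule activations, normalized,
    then a weighted sum of scalar rule conclusions.

    centers, widths: (n_rules, dim) arrays defining each rule's Gaussian
    membership over the input space.
    conclusions: (n_rules,) consequent values (tuned by the learning method).
    """
    x = np.asarray(x, dtype=float)
    # Product of per-dimension Gaussian memberships = one multivariate Gaussian
    d2 = ((x - centers) / widths) ** 2
    strengths = np.exp(-0.5 * d2.sum(axis=1))
    weights = strengths / strengths.sum()  # normalized firing strengths
    return float(weights @ conclusions)

# Two rules on a 1-D input, centered at 0 and 1, with conclusions 0 and 1.
centers = np.array([[0.0], [1.0]])
widths = np.ones((2, 1))
conclusions = np.array([0.0, 1.0])
y = fuzzy_net_output([0.5], centers, widths, conclusions)
```

Because the output is a normalized blend of rule conclusions, it varies continuously with the input, which is the "continuous input, continuous output" property the citing authors highlight.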