Abstract. Designs of adaptive fuzzy controllers (AFC) are commonly based on the Lyapunov approach, which requires a known model of the controlled plant. Such designs must also treat a Lyapunov function candidate as an evaluation function to be minimized. In this study, these drawbacks were addressed by designing a model-free adaptive fuzzy controller (MFAFC) that uses an approximate evaluation function defined in terms of the current state, the next state, and the control action. MFAFC treats the approximate evaluation function as an evaluative measure of control performance, similar to the state-action value function in reinforcement learning. Simulation results from applying MFAFC to the inverted pendulum benchmark verified the efficacy of the proposed scheme.
Keywords: adaptive fuzzy control; evaluation function; Lyapunov approach; model-free adaptive control; reinforcement learning.
Introduction
Many designs of adaptive fuzzy controllers (AFC) are based on the Lyapunov approach, which assumes that a Lyapunov function exists for the control problem to be solved. A decreasing Lyapunov function is taken as an indication of better control performance. The most important stage in the Lyapunov approach is to ensure that the first derivative of the Lyapunov function candidate (LFC) is either negative-definite or negative semi-definite [1][2][3][4]. This stage requires a known plant model or, at least, a known plant model structure.
The regular approach in model-based design of AFCs is that the engineer defines the LFC a priori, simply in terms of the distance between the current state and the goal state. A smaller error is then always assumed to indicate better control performance. However, in many control problems a smaller error does not indicate better control performance; that is, the error is not always evaluative. Action selection based on smaller errors can therefore misdirect the controller and lead to non-goal states. Generally speaking, errors are more instructional than evaluative [5], so they are not universally suitable as an evaluative measure of control performance. This is a flaw in model-based design of AFCs.
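The model dependence of the Lyapunov approach can be illustrated with a short worked example. The quadratic form of the LFC, the symmetric positive-definite matrix $P$, the constant goal state $x_g$, and the plant dynamics $\dot{x} = f(x, u)$ are assumptions introduced here for illustration only; they are not taken from the designs cited above.

```latex
% Illustrative sketch: e, P, x_g and f are assumed symbols, not the source's notation.
\begin{align}
  e &= x - x_g, \qquad V(e) = \tfrac{1}{2}\, e^{\top} P e, \qquad P = P^{\top} \succ 0, \\
  \dot{V}(e) &= e^{\top} P \,\dot{e} = e^{\top} P \, f(x, u) \;\le\; 0 .
\end{align}
```

Under these assumptions, $V$ depends only on the distance between the current state and the goal state, whereas checking the sign of $\dot{V}$ requires evaluating $f(x, u)$, that is, the plant model or at least its structure.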