2010 Eleventh Brazilian Symposium on Neural Networks 2010
DOI: 10.1109/sbrn.2010.44
Reinforcement Learning for Controlling a Coupled Tank System Based on the Scheduling of Different Controllers

Cited by 7 publications (5 citation statements)
References 3 publications
“…In the RL domain, we provide the algorithm only with a reward function, which signals when it is performing well and when it is performing poorly; the learning algorithm's task is then to choose actions over time so as to obtain large rewards. The goal of RL is to guide the agent in determining which action to take so as to maximize the sum of the numerical reward (or punishment) signals it receives over time, called the total expected reward [29]. According to Fig.…”
Section: Reinforcement Learning (RL)
confidence: 99%
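The "total expected reward" described above can be made concrete with a minimal sketch of a discounted return; the reward values and discount factor below are illustrative assumptions, not quantities from the paper:

```python
# Minimal sketch: the cumulative (discounted) reward an RL agent maximizes.
# The reward list and gamma are hypothetical illustrative values.

def discounted_return(rewards, gamma=0.9):
    """Return G = sum_t gamma**t * r_t over a sequence of per-step rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

rewards = [1.0, 0.0, 0.5, 2.0]  # hypothetical per-step reward signals
total = discounted_return(rewards)  # later rewards are weighted down by gamma**t
```

A discount factor below 1 makes near-term rewards count more than distant ones, which keeps the sum finite over long horizons.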
“…Following this idea, [8] define reinforcement learning (RL) as learning what to do, that is, mapping situations to actions, in a way that maximizes a numerical reward.…”
Section: Reinforcement Learning
confidence: 99%
“…Hence, a policy π is the mapping from states s to actions a taken from those states, and represents the probability of selecting each possible action, in such a way that the best actions correspond to the highest probability of choice [11]. [8] explains that the quality of the actions taken by the agent can be evaluated with the "action-value function for policy π", which represents an estimate of the total expected return, i.e., the quality of the action taken by the agent when it is following some policy π. This function gives the expected total return when, from the current state s, the action a is chosen and the policy π is followed thereafter, as shown in (7).…”
Section: Reinforcement Learning
confidence: 99%
“…A non-linear model gives a more accurate prediction over a wider operating range of control [4]. The coupled tank system (CTS) considered in this study is a typical example of a plant with a high degree of non-linearity [26,27]. The non-linearity in the CTS arises mainly from its basic dynamic equations, the characteristics of the valves, and the nonlinear flow characteristics in the tank system [4].…”
Section: Introduction
confidence: 99%
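The nonlinearity mentioned above typically enters through square-root outflow terms (Torricelli's law) in the tank level equations. A minimal simulation sketch follows; the tank areas, valve coefficients, and inflow are assumed illustrative values, not the plant parameters from the paper:

```python
import math

# Hedged sketch of coupled-tank level dynamics with square-root (Torricelli)
# outflow terms, integrated with a simple Euler step. All physical parameters
# below are illustrative assumptions, not the paper's plant.

G = 981.0  # gravitational acceleration, cm/s^2

def coupled_tank_step(h1, h2, q_in, dt=0.1,
                      A=66.25,            # tank cross-section area (cm^2), assumed
                      a1=0.71, a2=0.57):  # outlet valve areas (cm^2), assumed
    """Advance the two tank levels (h1, h2) by one Euler step."""
    flow_12 = a1 * math.sqrt(2 * G * max(h1, 0.0))   # tank 1 -> tank 2
    flow_out = a2 * math.sqrt(2 * G * max(h2, 0.0))  # tank 2 -> outlet
    h1 += dt * (q_in - flow_12) / A                  # inflow feeds tank 1
    h2 += dt * (flow_12 - flow_out) / A              # tank 1 drains into tank 2
    return max(h1, 0.0), max(h2, 0.0)

# Simulate a constant pump inflow from an initial condition.
h1, h2 = 10.0, 5.0
for _ in range(100):
    h1, h2 = coupled_tank_step(h1, h2, q_in=50.0)
```

The sqrt terms are what make the dynamics level-dependent: a linearization around one operating level degrades as the levels move away from it, which motivates scheduling different controllers across the operating range.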