2019
DOI: 10.1016/j.biosystems.2019.02.009

Guaranteed satisficing and finite regret: Analysis of a cognitive satisficing value function

Abstract: As reinforcement learning algorithms are being applied to increasingly complicated and realistic tasks, it is becoming increasingly difficult to solve such problems within a practical time frame. Hence, we focus on a satisficing strategy that looks for an action whose value is above the aspiration level (analogous to the break-even point), rather than the optimal action. In this paper, we introduce a simple mathematical model called risk-sensitive satisficing (RS) that implements a satisficing strategy by inte…

Cited by 8 publications (5 citation statements)
References: 26 publications
“…We call an action whose value exceeds the aspiration level a satisfactory action. RS was originally derived as a generalization of the extremely symmetric form of the conditional probability (see Appendix of [7]). Further, it has the property that the direction [of] the risk consideration in the value evaluation [is] determined by [whether] the action is satisfactory.…”
Section: Satisficing Algorithm: RS (mentioning)
confidence: 99%
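
The quoted property can be illustrated with a small sketch. The citation statement does not give the RS formula, so the form used here, RS_i = (n_i / N)(E_i − ℵ), is an assumption, as are the function names and the example numbers. Under that assumed form, a satisfactory action (mean above ℵ) scores higher the more often it has been tried, while an unsatisfactory action scores higher the less often it has been tried.

```python
def rs_value(n_i, total, mean_i, aspiration):
    """Assumed RS form: (n_i / N) * (E_i - aleph). Not taken from the page."""
    return (n_i / total) * (mean_i - aspiration)


def is_satisfactory(mean_i, aspiration):
    """An action is satisfactory when its value exceeds the aspiration level."""
    return mean_i > aspiration


aleph = 0.5  # hypothetical aspiration level

# Two satisfactory actions with the same mean but different trial counts:
# the more-tried one gets the larger value (risk-averse direction).
often_good = rs_value(80, 100, 0.7, aleph)
rarely_good = rs_value(20, 100, 0.7, aleph)

# Two unsatisfactory actions with the same mean:
# the less-tried one gets the larger value (risk-seeking direction).
often_bad = rs_value(80, 100, 0.3, aleph)
rarely_bad = rs_value(20, 100, 0.3, aleph)
```

The sign of (E_i − ℵ) is what flips the preference, which is one reading of "the direction of the risk consideration" in the statement above.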
“…where ℵ_opt is the optimal aspiration level, p_first is the largest reward distribution, and p_second is the second largest reward distribution. RS can find the optimal action sequence with a small number of exploratory actions by setting the aspiration level in accordance with equation (4) [7]. This is because, when given an optimal aspiration level, the agent can only be satisfied with the action with the largest reward distribution (the optimal action).…”
Section: Satisficing Algorithm: RS (mentioning)
confidence: 99%
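
As a rough illustration of the claim in this statement, the following sketch runs a greedy satisficing agent on a Bernoulli bandit. Several things are assumptions rather than content from the page: the RS form (n_i / N)(E_i − ℵ), the greedy selection rule, the reward probabilities, and the aspiration level. Equation (4) is not reproduced in the statement, so the sketch simply picks a hypothetical ℵ strictly between p_second and p_first as a stand-in.

```python
import random


def rs_values(counts, means, aspiration):
    """Assumed RS form per arm: (n_i / N) * (E_i - aleph); 0 for untried arms."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [(n / total) * (m - aspiration) for n, m in zip(counts, means)]


def run_rs_bandit(probs, aspiration, steps, seed=0):
    """Greedy selection on the assumed RS values; returns pull counts per arm."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)
    for _ in range(steps):
        values = rs_values(counts, means, aspiration)
        # Greedy choice; ties go to the lowest index.
        arm = max(range(len(probs)), key=values.__getitem__)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts


# Hypothetical setup: p_first = 0.8, p_second = 0.3, aleph chosen between them.
pulls = run_rs_bandit([0.2, 0.8, 0.3], aspiration=0.55, steps=5000, seed=1)
```

With ℵ between the two largest reward probabilities, only the best arm has a mean above ℵ in the long run, so the other arms' values drift negative and the agent settles on the best arm after a short exploratory phase.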