2022
DOI: 10.48550/arxiv.2205.10316
Preprint

Seeking entropy: complex behavior from intrinsic motivation to occupy action-state path space

Abstract: Intrinsic motivation generates behaviors that do not necessarily lead to immediate reward, but help exploration and learning. Here we show that agents having the sole goal of maximizing occupancy of future actions and states, that is, moving and exploring on the long term, are capable of complex behavior without any reference to external rewards. We find that action-state path entropy is the only measure consistent with additivity and other intuitive properties of expected future action-state path occupancy. W…

Cited by 2 publications (4 citation statements)
References 28 publications
“…Another study that used a similar approach revealed that people prefer options that make more future options available, above and beyond utility concerns; and that the availability of options positively influenced participants' perceived freedom of choice [128]. This resonates with findings from a recent model based on an information theoretic approach [129]. This study shows that endowing agents with the tendency to maximize the foreseen occupancy of their future actions and states can account for a range of complex behaviors with no need to invoke reward seeking.…”
Section: The Study Of Rich Behavior In Psychology (supporting)
confidence: 55%
“…Individual features such as risk aversion (third term in Eq. 1) have been demonstrated to influence the allocation of limited resources in various types of uncertain decisions (Chronopoulos et al, 2011; Dow & Werlang, 1992; Tulloch et al, 2015), suggesting that it might affect participants' propensity to skip a trial ( ) and leave it to chance. Finally, participants sampling behaviour may be driven by a tendency to occupy action-state space (maximum occupancy principle), compelling them to try out various resources allocations and gain a global understanding of the environment (Ramírez-Ruiz et al, 2022). To model this, we introduced an entropy term.…”
Section: Evidence For An Intentional Strategy (mentioning)
confidence: 99%
“…Such behavioural tendencies contribute to reducing prediction errors and constructing a more accurate model of the environment, facilitating rapid adaptation to potential changes. Moreover, the maximum occupancy principle, which favours entropy seeking, constitutes the core objective of newer theoretical frameworks modelling behaviour, offering an alternative perspective to reward maximisation (Ramírez-Ruiz et al, 2022).…”
Section: Entropy Seeking (mentioning)
confidence: 99%