2014
DOI: 10.1007/s10458-014-9257-1

On environment difficulty and discriminating power

Abstract: This paper presents a way to estimate the difficulty and discriminating power of any task instance. We focus on a very general setting for tasks: interactive (possibly multi-agent) environments where an agent acts upon observations and rewards. Instead of analysing the complexity of the environment, the state space or the actions that are performed by the agent, we analyse the performance of a population of agent policies against the task, leading to a distribution that is examined in terms of policy complexit…

Cited by 12 publications (16 citation statements)
References 45 publications
“…We set tasks and agents as asynchronous interactive systems, where difficulty is seen as computational steps of a Levin search, but this search has to be modified to cover stochastic behaviours. These ideas are an evolution and continuation of early notions of task and difficulty in [8] and [6] respectively. The relevance of verification in difficulty has usually been associated with deduction.…”
Section: Difficulty As Levin Search With Stochastic Verification
Citation type: mentioning
confidence: 93%
“…[43]), one simple notion that accounts for this concept quite well is the variance of results. In order to formalise this notion of variance (or number of different values) of the expected result of the set of evaluated agents, we can just compare pairs of values as follows:…”
Section: Fine and Coarse Discrimination
Citation type: mentioning
confidence: 99%
“…Every test should provide which level of difficulty it is evaluating. The difficulty of the environment could be calculated as, for example, the performance of a distribution of agents' policies with different levels of complexity, as presented in [43]. We postulated that the level of difficulty should be determined by the agents included in the environment (and their intelligence), the partition of agent slots that determines how teams are formed and the environment where the agent is evaluated.…”
Section: Our Definition Uses Various Sets and Weights In The Formula
Citation type: mentioning
confidence: 99%
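
The two ideas quoted above (difficulty read off the performance of a population of policies of varying complexity, and variance of results as a simple indicator of discrimination) can be illustrated with a minimal sketch. It is not the formulation of the cited paper: the function names, the `(complexity, reward)` representation, the `tolerance` threshold and the sample numbers are all assumptions made for illustration only.

```python
from statistics import pvariance

# Hypothetical population: each policy is a (complexity, mean_reward) pair.
# The numbers are invented for illustration, not taken from any paper.
population = [(3, 0.10), (5, 0.40), (8, 0.85), (12, 0.90), (20, 0.95)]


def estimate_difficulty(policies, tolerance):
    """Return the smallest complexity among policies whose reward reaches
    `tolerance`, or None if no policy in the population is acceptable.

    Assumption: difficulty is approximated by the simplest acceptable policy.
    """
    acceptable = [complexity for complexity, reward in policies
                  if reward >= tolerance]
    return min(acceptable) if acceptable else None


def discriminating_power(policies):
    """Return the population variance of the rewards.

    Assumption: variance of results is used as a crude discrimination proxy.
    """
    return pvariance([reward for _, reward in policies])


print(estimate_difficulty(population, tolerance=0.8))
print(discriminating_power(population))
```

Under these assumptions, a task on which every policy obtains the same reward has zero variance and therefore does not separate the agents at all, whatever its estimated difficulty.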
“…In fact, a league may be redundant (for the same reasons why the information-driven or difficulty-driven sampling are introduced) and other tournament arrangements are more effective with almost the same robustness and much fewer matches. Figure 4: We show the distributions of reward (roughly corresponding to R in this paper) for different configurations for the multi-agent system SCMAS introduced in [70]. Left: the plot shows the results when we confront each of the 2,000 policies with 50 different teams of competitors (with different seeds for the generator also).…”
Section: Evaluation By Peer Confrontation
Citation type: mentioning
confidence: 99%