Proceedings of the 24th International Conference on Machine Learning 2007
DOI: 10.1145/1273496.1273606

On the role of tracking in stationary environments

Abstract: It is often thought that learning algorithms that track the best solution, as opposed to converging to it, are important only on nonstationary problems. We present three results suggesting that this is not so. First, we illustrate in a simple concrete example, the Black and White problem, that tracking can perform better than any converging algorithm on a stationary problem. Second, we show the same point on a larger, more realistic problem, an application of temporal-difference learning to computer Go. Our thir…

Cited by 45 publications (35 citation statements)
References 10 publications
“…Even if this were possible, though, the fitness of the entire lifetime is the most important factor, and this usually depends on learning efficiency more than the asymptotic result. Sutton et al [48] make related observations about the limitations of asymptotic optimality.…”
Section: Relation To Other Research (mentioning)
confidence: 99%
“…Li's KIMEL algorithm transforms the nonlinear input data with a kernel into a high-dimensional but linear feature space where linear IDBD is applied. Sutton and Koop [15], [16] developed another nice nonlinear extension IDBD-nl of the original IDBD algorithm using the logistic sigmoid function. It was applied for learning the game Go.…”
Section: Related Work (mentioning)
confidence: 99%
“…For all other algorithms we use this sigmoid function. An exception is Koop's IDBD-nl [15], [16] being derived for the logistic sigmoid function, which we use in this case instead of tanh. …”
Section: Other Algorithms (mentioning)
confidence: 99%
“…As the dynamics of a robot can change due to many external factors ranging from temperature to wear, the learning process may never fully converge, i.e., it needs a "tracking solution" [Sutton et al, 2007]. Frequently, the environment settings during an earlier learning period cannot be reproduced and the external factors are not clear, e.g., how the light conditions affect the performance of the vision system and, as a result, the task's performance.…”
Section: Curse Of Real-world Samples (mentioning)
confidence: 99%