Toward an ideal trainer

Epstein, Susan L.

doi:10.1007/bf00993346

Cited by 25 publications

(21 citation statements)

References 15 publications


“…The first problem is that it is likely to get stuck on a self-consistent but a non-optimal strategy [13]. Secondly, there is no guarantee that the portions of the strategy space searched are the most significant ones [14]. These problems are addressed by ensuring that the population diversity is adequate to avoid local minima and to cover a larger search space.…”

Section: Competitive Environments (mentioning)

confidence: 99%

Tournament Particle Swarm Optimization

Duminy

Engelbrecht

2007

2007 IEEE Symposium on Computational Intelligence and Games


Abstract: This paper introduces Tournament Particle Swarm Optimization (PSO) as a method to optimize weights of game tree evaluation functions in a competitive environment using Particle Swarm Optimization. This method makes use of tournaments to ensure a fair evaluation of the performance of particles in the swarm, relative to that of other particles. The empirical work presented compares the performance of different tournament methods that can be applied to the Tournament PSO, with application to Checkers.


“…Because a novice cannot always capitalize appropriately on its own good patterns or exploit the opposition's poor ones, the learner may initially make incorrect associations, only to find them contradicted later when it plays better (Epstein, 1994c). Our learning algorithm therefore employs a confidence parameter to revalue responses in the face of disagreeing evidence.…”

Section: Managing Inconsistency (mentioning)

confidence: 99%

“…It learned lose tic-tac-toe and five men's morris, however, with a behavioral standard of 20 and lesson and practice training (Epstein, 1994c). In this environment (unnecessary for the easier game of tic-tac-toe), the program cycles between lessons (a set of two contests against the expert) and practice (a set of seven contests against itself).…”

Section: Correct Reflection Concept (mentioning)

confidence: 99%
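
The lesson-and-practice cycle described in the statement above can be sketched as a simple opponent schedule. This is only an illustration, not the cited program's implementation: the function name `training_schedule` and the labels "expert" and "self" are hypothetical, and the "behavioral standard of 20" mentioned in the quote is not modeled. Only the two set sizes (two contests per lesson, seven per practice set) come from the quoted passage.

```python
from itertools import cycle, islice

# Set sizes taken from the quoted passage; everything else is an assumption.
LESSON_CONTESTS = 2    # contests against the expert in one lesson
PRACTICE_CONTESTS = 7  # contests against itself in one practice set


def training_schedule(total_contests):
    """Return the opponent for each contest, alternating lessons and practice.

    The learner first plays a lesson (against the expert), then a practice
    set (against itself), and repeats until total_contests is reached.
    """
    one_cycle = ["expert"] * LESSON_CONTESTS + ["self"] * PRACTICE_CONTESTS
    return list(islice(cycle(one_cycle), total_contests))


if __name__ == "__main__":
    # Print the opponents for the first full cycle plus the next lesson.
    for number, opponent in enumerate(training_schedule(11), start=1):
        print(number, opponent)
```

Under these assumptions, each full cycle contains nine contests, of which two are against the expert.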

Pattern‐based Learning and Spatially Oriented Concept Formation in a Multi‐agent, Decision‐making Expert

Epstein

Gelfand

Lesniak

1996

Computational Intelligence

Self Cite


As they gain expertise in problem solving, people increasingly rely on patterns and spatially oriented reasoning. This paper describes an associative visual‐pattern classifier and the automated acquisition of new, spatially oriented reasoning agents that simulate such behavior. They are incorporated into a multi‐agent game‐learning program whose architecture robustly combines agents with conflicting perspectives. When tested on three games, the visual‐pattern classifier learns meaningful patterns, and the pattern‐based, spatially oriented agents generalized from these patterns are generally correct. The accuracy of the contribution of each of the newly created agents to the decision‐making process is measured against an expert opponent, and a perceptron‐like algorithm is used to learn game‐specific weights for these agents. Much of the knowledge encapsulated by the new agents was previously inexpressible in the program's representation and in some cases is not readily deducible from the rules.


“…Experiments with Hoyle, for example, found that playing against a perfect player (a program that always makes an optimal move) was too narrow (Epstein, 1994b). An expert game player, after all, should hold its own against opponents of any strength.…”

Section: Modeling Expertise (mentioning)

confidence: 99%

Learning Expertise with Bounded Rationality and Self-Awareness

Epstein

Petrovic

2011

Metareasoning

Self Cite


To address computationally challenging problems, ingenious researchers often develop a broad variety of heuristics with which to reason and learn. The integration of such good ideas into a robust, flexible environment presents a variety of difficulties, however. This paper describes how metareasoning that relies upon expertise, bounded rationality, and self-awareness supports a self-adaptive architecture for learning and problem solving. The resultant programs develop considerable skill on problems in three very different domains. They also provide insight into the strengths and pitfalls of metareasoning.

Anthropologists tell us that an expert is one who performs a task better and faster than the rest of us (D'Andrade, 1991). A programmed expert for challenging problems, however, is unlikely to be given every detail of its reasoning process in advance; rather, it is expected to learn its expertise on its own, to be self-adaptive. Ideally, expertise develops quickly. To accelerate its performance during both learning and testing, a self-adaptive system is likely to be subjected to bounded rationality, that is, to have limits placed on its space and time resources. As a result, computer scientists often construct self-aware programs that observe their own behavior and monitor their own reasoning to improve their performance, as in Figure 1 (Cox and Raja, 2007). The perils of such metareasoning become quickly evident in any ambitious application, however. We believe that easy problems should be solved quickly, and that hard problems should take a bit longer. Rather than rely on thousands of learning experiences, the learners we describe develop considerable expertise after experience with relatively few problems. This paper recounts the challenges posed to one learning and problem-solving architecture by three different problem domains, and how metareasoning addresses those challenges successfully. The first section describes the architecture and the domains. The second section describes the premises that led to the architecture's structure. Subsequent sections explore the impact of bounded rationality, how to assess expertise, how to manage large bodies of heuristics to learn expertise, and how to think less but still maintain performance.

The Architecture and the Problems

FORR (For the Right Reasons) is a learning and problem solving architecture that models the development of expertise with metareasoning (Epstein, 1994a). From its experience on a set of problems, FORR learns to solve other, similar problems. Together, the problems it solves and those expected to be similar to them constitute a problem class. On any given problem, FORR seeks a sequence of actions that solves the problem, and can explain the reasoning that underlies its decisions.

FORR provides a flexible environment within which to design and execute experiments in metareasoning. The domain-dependent ground level in Figure 1 describes each world state as it appears during search for a solution. The object level re-represents and reason...
