A Stochastic Learning Model of Economic Behavior

Cross, John G.

doi:10.2307/1882186

Cited by 155 publications

(80 citation statements)

References 0 publications

“…Example 1 [Cross, 1973] The set of states is S = ∆(A), where each component of s ∈ ∆(A) corresponds to each action in A. The state transition rule, π :…”

Section: Examples

mentioning

confidence: 99%

“…In game theory, since payoffs are considered to be von Neumann-Morgenstern utility (see, e.g., Fudenberg and Tirole (1991)), u has been interpreted as a Bernoulli utility function. One may conjecture that if u is concave, then the resulting learning rule is monotonically risk averse. An argument analogous to the one we used for March's (1996) Weighted Return over Gains model reveals that this conjecture is false for any ρ > 0, because the belief-based learning rule fails to be globally impartial.…”

Section: Examples

mentioning

confidence: 99%

“…In particular, we do not need to assume that individuals have Bernoulli utility functions or that they play any role in the learning process. Examples of learning models that satisfy our assumptions include those in Bush and Mosteller (1951), Cross (1973), Roth and Erev (1995), March (1996), and Börgers and Sarin (2000).…”

Section: Introduction

mentioning

confidence: 99%

Learning and risk aversion

Oyarzun

Sarin

2013

Journal of Economic Theory

We study the manner in which learning shapes behavior towards risk when individuals are not assumed to know, or to have beliefs about, probability distributions. In any period, the behavior change induced by learning is assumed to depend on the action chosen and the payoff obtained. We characterize learning processes that, in expected value, increase the probability of choosing the safest (or riskiest) actions and provide sufficient conditions for them to converge, in the long run, to the choices of risk averse (or risk seeking) expected utility maximizers. We provide a learning theoretic motivation for long run risk choices, such as those in expected utility theory with known payoff distributions.

“…We end this section with an example of a learning rule due to Cross (1973). In the next section we shall show that all absolutely expedient or monotone learning rules have a structure that is similar to the structure of Cross' learning rule.…”

Section: Absolute Expediency and Monotonicity

mentioning

confidence: 87%

“…A necessary condition for both absolute expediency and monotonicity is that the decision maker uses Cross' (1973) learning rule, or a modified version of this learning rule. Cross' rule requires that the decision maker raise the probability of the strategy that he or she chose in proportion to the payoff received, and that all other choice probabilities be reduced proportionally.…”

mentioning

confidence: 99%
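The rule described in this statement can be sketched in a few lines. This is a minimal illustration rather than Cross's original formulation: it assumes payoffs are normalized to lie in [0, 1], and the function name `cross_update` and the example numbers are hypothetical.

```python
def cross_update(probs, chosen, payoff):
    """One step of a Cross-style reinforcement update (illustrative sketch).

    Assumes 0 <= payoff <= 1 so that the result remains a probability vector.
    """
    if not 0.0 <= payoff <= 1.0:
        raise ValueError("payoff must be normalized to [0, 1]")
    updated = []
    for i, p in enumerate(probs):
        if i == chosen:
            # Chosen action: move toward 1 in proportion to the payoff.
            updated.append(p + payoff * (1.0 - p))
        else:
            # Every other action: shrink proportionally by the same payoff.
            updated.append(p * (1.0 - payoff))
    return updated


# Hypothetical example: three actions, the first is chosen and pays 0.4.
probs = cross_update([0.5, 0.3, 0.2], chosen=0, payoff=0.4)
```

Starting from (0.5, 0.3, 0.2), the update gives approximately (0.7, 0.18, 0.12): the chosen action gains probability and the others lose it in proportion to their current weights, so the vector still sums to one.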

Expedient and Monotone Learning Rules

Börgers

Morales

Sarin

2004

Econometrica

This paper considers learning rules for environments in which little prior and feedback information is available to the decision maker. Two properties of such learning rules are studied: absolute expediency and monotonicity. Both require that some aspect of the decision maker's performance improves from the current period to the next. The paper provides some necessary, and some sufficient conditions for these properties. It turns out that there is a large variety of learning rules that have the properties. However, all learning rules that have these properties are related to the replicator dynamics of evolutionary game theory. For the case in which there are only two actions, it is shown that one of the absolutely expedient learning rules dominates all others.
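The relation to the replicator dynamics can be checked numerically for the basic Cross-style rule. The sketch below is only an illustration for that one rule, not the paper's general result. It assumes payoffs in [0, 1] and, for simplicity, a fixed payoff per action; the function names and numbers are hypothetical.

```python
def expected_change(probs, payoffs):
    """Expected one-step change in each choice probability under a
    Cross-style rule, averaging over which action is chosen."""
    n = len(probs)
    change = [0.0] * n
    for chosen in range(n):
        x = payoffs[chosen]  # payoff received when `chosen` is played
        for i in range(n):
            if i == chosen:
                step = x * (1.0 - probs[i])   # chosen action is reinforced
            else:
                step = -x * probs[i]          # others shrink proportionally
            change[i] += probs[chosen] * step
    return change


def replicator_term(probs, payoffs):
    """Replicator-style expression p_i * (payoff_i - average payoff)."""
    avg = sum(p * x for p, x in zip(probs, payoffs))
    return [p * (x - avg) for p, x in zip(probs, payoffs)]


# Hypothetical example with three actions.
probs = [0.5, 0.3, 0.2]
payoffs = [0.9, 0.4, 0.1]
lhs = expected_change(probs, payoffs)
rhs = replicator_term(probs, payoffs)
```

Under these assumptions the two lists agree up to floating-point error: in expectation, each action's probability grows at a rate proportional to how far its payoff exceeds the current average payoff.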

A re‐examination of probability matching and rational choice

Shanks

Tunney

McCarthy

2002

Journal of Behavioral Decision Making

In a typical probability learning task participants are presented with a repeated choice between two response alternatives, one of which has a higher payoff probability than the other. Rational choice theory requires that participants should eventually allocate all their responses to the high-payoff alternative, but previous research has found that people fail to maximize their payoffs. Instead, it is commonly observed that people match their response probabilities to the payoff probabilities. We report three experiments on this choice anomaly using a simple probability learning task in which participants were provided with (i) large financial incentives, (ii) meaningful and regular feedback, and (iii) extensive training. In each experiment large proportions of participants adopted the optimal response strategy and all three of the factors mentioned above contributed to this. The results are supportive of rational choice theory.

Copyright © 2002 John Wiley & Sons, Ltd.

Key words: probability matching; maximization; choice; rationality; feedback; payoffs; learning; reinforcement

A striking violation of rational choice theory is commonly observed in simple repeated binary choice tasks in which a payoff is available with higher probability given one response than another. In such tasks people often tend to 'match' probabilities: that is, they allocate their responses to the two options in proportion to their relative payoff probabilities. Thus suppose that a monetary payoff of fixed size is given with probability p = 0.7 for choosing left and with probability 1 − p = 0.3 for choosing right. Probability matching refers to behavior in which left is chosen on about 70% of trials and right on 30%. Such responding violates rational choice theory because the optimal strategy in such tasks, after an initial period of experimentation and assuming that the payoff probabilities are stationary, is always to select the option associated with the higher probability of payoff. On any trial, the expected payoff for choosing left is higher than the expected payoff for choosing right.
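The cost of matching in the example above can be made concrete with a short calculation. This is a minimal sketch using the abstract's numbers (payoff probability 0.7 for left and 0.3 for right); the function name `expected_payoff_rate` is hypothetical, and the payoff size is normalized to one.

```python
def expected_payoff_rate(p_left_payoff, p_choose_left):
    """Long-run fraction of trials that pay off when left is chosen with
    probability `p_choose_left` and right pays with 1 - p_left_payoff."""
    p_right_payoff = 1.0 - p_left_payoff
    return (p_choose_left * p_left_payoff
            + (1.0 - p_choose_left) * p_right_payoff)


# Probability matching: choose left on 70% of trials.
matching = expected_payoff_rate(0.7, 0.7)

# Maximizing: always choose the higher-probability option.
maximizing = expected_payoff_rate(0.7, 1.0)
```

Matching pays off on about 58% of trials (0.7 × 0.7 + 0.3 × 0.3), whereas always choosing left pays off on 70%, which is why matching counts as a departure from the optimal strategy.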
