This paper considers how fluent language users are rational in their language processing, with unconscious language representation systems optimally prepared for comprehension and production; how language learners are intuitive statisticians; and how acquisition can be understood as contingency learning. But there are important aspects of second language acquisition that do not appear to be rational, where input fails to become intake. The paper describes the types of situation where cognition deviates from rationality, and it introduces how the apparent irrationalities of L2 acquisition result from standard phenomena of associative learning as encapsulated in the models of Rescorla and Wagner (1972) and Cheng and Holyoak (1995), which describe how cue salience, outcome importance, and the history of learning from multiple probabilistic cues affect the development of 'learned selective attention' and transfer.

It also considers methods that abstract this type of information (contingency learning according to the one-way dependency statistic and the Probability Contrast Model; Cheng and Holyoak 1995) and the way that human associative learning accords with the predictions of these methods (Shanks 1995). When first and second language acquisition are considered in these terms, L1 acquisition seems much more obviously rational and contingency-sensitive than does L2 acquisition. I lay the foundations for a companion article (Ellis 2006) that describes the types of situation where associative learning deviates from rationality and that argues that the apparent irrationalities of L2 acquisition result from standard phenomena of associative learning: attentional shifting in perceptual learning, latent inhibition, blocking, overshadowing, and other effects of salience, transfer, and inhibition. I describe how 'learned attention' explains these apparently irrational effects, and how theories of animal and human associative learning include selective attention as a key component.
The human learning mechanism optimizes its representations of the first language from the cumulative sample of first language input. But the initial state for L2 is not a tabula rasa; it is a tabula repleta. The optimal solution for L2 is not that for L1, and L2 acquisition suffers from various types of L1 interference. Thus the shortcomings of the L2 end-state are more rational when seen through the lenses of the L1.

THE DESIGN OF AN OPTIMAL WORD PROCESSOR

Consider word processing programs you have known. You probably have strong views. Remember the one that crashed at 2 a.m., losing your only copy; the one that took ten minutes to search and replace; the one that required perverse three-letter combination commands; and the one you have now that replaces spellings, styles, and punctuation for you, whether you like it or not, and that you still cannot figure out how to stop? Besides the obvious requirements for speed and reliability, an optimal word processing program would really know what you wanted to do next, and would present you with your next needed command, file, o...