Partial least squares (PLS) is one of the most common regression algorithms in chemistry, relating input-output samples (x_i, y_i) by a linear multivariate model. In this paper we analyze the PLS algorithm under a specific probabilistic model for the relation between x and y. Following Beer's law, we assume a linear mixture model in which each data sample (x, y) is a random realization from a joint probability distribution: x is the sum of k components, each a random variable, multiplied by their respective characteristic responses. We analyze PLS on this model in two idealized settings: noise-free samples, and an infinite number of noisy training samples. In the noise-free case we prove that, as expected, the regression vector computed by PLS is, up to normalization, the net analyte signal, and that PLS computes this vector after at most k iterations, where k is the total number of components. In the case of an infinite training set corrupted by unstructured noise, we show that the final regression vector computed by PLS is in general not purely proportional to the net analyte signal vector, but has the important property of being optimal under a mean squared error of prediction criterion. This result can be viewed as an asymptotic optimality of PLS in the limit of a very large but finite training set.
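The noise-free claim can be illustrated numerically with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the names `pls1`, `U` and `V`, the choice of y as the concentration of the first component, the random concentrations and responses, the mean-centering step, and the textbook NIPALS variant of PLS1. The net analyte signal is taken here to be the part of the first component's response that is orthogonal to the responses of the other components.

```python
import numpy as np

def pls1(X, y, n_components):
    """Textbook NIPALS PLS1 on mean-centered data; returns the regression vector."""
    Xd = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    W = np.zeros((p, n_components))   # weight vectors
    P = np.zeros((p, n_components))   # X loadings
    q = np.zeros(n_components)        # y loadings
    for a in range(n_components):
        w = Xd.T @ yc
        w /= np.linalg.norm(w)
        t = Xd @ w                    # score vector
        tt = t @ t
        P[:, a] = Xd.T @ t / tt
        q[a] = yc @ t / tt
        W[:, a] = w
        Xd = Xd - np.outer(t, P[:, a])  # deflate X
    return W @ np.linalg.solve(P.T @ W, q)

# Hypothetical noise-free linear mixture: k components, p channels, n samples.
rng = np.random.default_rng(0)
n, p, k = 200, 30, 3
V = rng.random((p, k))                              # characteristic responses (columns)
U = rng.random((n, k)) * np.array([1.0, 2.0, 0.5])  # random component concentrations
X = U @ V.T                                         # x = sum_j u_j v_j, no noise
y = U[:, 0]                                         # assumed: y is the first concentration

b = pls1(X, y, n_components=k)                      # k PLS iterations

# Net analyte signal: v_1 minus its projection onto span(v_2, ..., v_k).
others = V[:, 1:]
coef, *_ = np.linalg.lstsq(others, V[:, 0], rcond=None)
nas = V[:, 0] - others @ coef

cosine = b @ nas / (np.linalg.norm(b) * np.linalg.norm(nas))
print("cosine between PLS vector and net analyte signal:", cosine)
```

In this sketch the comparison uses the cosine of the angle between the two vectors, which is insensitive to the normalization mentioned in the text. Adding noise to `X` and increasing `n` would be a natural way to explore the second setting, although a finite simulation can only approximate the infinite-sample result.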