Additive and multiplicative regression models of habituation were compared with respect to their fit to looking times (LTs) from a habituation experiment with infants aged between 3 and 11 months. In contrast to earlier studies, the current study considered multiple probability distributions, namely the Weibull, gamma, lognormal and normal distributions. In the habituation experiment, the type of contrast between the habituation and the test trial (luminance, color or orientation contrast) was varied and crossed with the number of habituation trials (1, 3, 5 or 7) and with three age cohorts (4, 7 and 10 months). The initial mean LT to dark stimuli (around 3.7 s) was considerably shorter than the mean LT to green and gray stimuli (around 5 s). Infants showed the strongest dishabituation to changes from dark to bright (luminance contrast) and weak-to-no dishabituation to a 90-degree rotation of the gray stimuli (orientation contrast). Dishabituation was stronger after five and seven habituation trials, but this result was not statistically robust. The gamma distribution showed the best fit in terms of log-likelihood and mean absolute error, as well as the best predictive performance. Furthermore, the gamma model showed small correlations between parameters relative to the other models. The normal additive model showed an inferior fit and medium correlations between the parameters. In particular, the positive correlation between the initial LT and the habituation rate was likely responsible for an interpretation of the main effect of age on the habituation rate that differed from that of the multiplicative models. Otherwise, the additive and multiplicative models provided similar statistical conclusions. Model versions without pooling and with partial pooling across participants (also called random-effects, multi-level or hierarchical models) were compared.
The latter type of model showed a worse data fit but more precise predictions and reduced correlations between the parameters. Model variants with auto-regressive time structures were explored but showed a considerably worse fit. Quadratic models that allowed non-monotonic changes in LTs were investigated as well. However, when fitted to the LT data, these models did not produce non-monotonic changes in LTs. The study underscores the utility of partial-pooling models for providing more accurate predictions. Further, it agrees with previous research in that a multiplicative LT model is preferable. Nevertheless, the current results suggest that the impact of choosing an additive model on statistical inference is less dramatic than previously assumed.
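
The distinction between additive and multiplicative habituation models can be illustrated with a minimal sketch. It is not the study's implementation: the linear and exponential mean functions, the parameter values (a 5 s initial LT, rates of 0.5 s and 0.15 per trial, gamma shape 4) and the function names are assumptions chosen for illustration only. In the additive form the expected LT changes by a fixed number of seconds per trial, whereas in the multiplicative form it changes by a fixed proportion per trial, which pairs naturally with a positive-valued distribution such as the gamma.

```python
import numpy as np


def additive_mean(trial, initial_lt, rate):
    """Additive form (assumed): expected LT drops by `rate` seconds per trial."""
    return initial_lt - rate * trial


def multiplicative_mean(trial, initial_lt, rate):
    """Multiplicative form (assumed): expected LT shrinks by a fixed factor per trial."""
    return initial_lt * np.exp(-rate * trial)


def simulate_gamma_lts(mean_lt, shape, n_infants, rng):
    """Draw gamma-distributed LTs whose expected value equals `mean_lt`.

    For a gamma distribution, mean = shape * scale, so scale = mean / shape.
    """
    return rng.gamma(shape=shape, scale=mean_lt / shape,
                     size=(n_infants, len(mean_lt)))


# Hypothetical values, not estimates from the study.
trials = np.arange(7)                      # trial index 0..6
add = additive_mean(trials, 5.0, 0.5)      # constant difference between trials
mult = multiplicative_mean(trials, 5.0, 0.15)  # constant ratio between trials

rng = np.random.default_rng(0)
simulated = simulate_gamma_lts(mult, shape=4.0, n_infants=1000, rng=rng)

print("additive means:      ", np.round(add, 2))
print("multiplicative means:", np.round(mult, 2))
print("simulated trial means:", np.round(simulated.mean(axis=0), 2))
```

Under these assumptions, successive additive means differ by a constant amount and can eventually reach zero or become negative, while successive multiplicative means differ by a constant ratio and stay positive.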