Computational model of attentive voice tracking

Humans are able to follow a given speaker even in challenging acoustic conditions. The perceptual mechanisms underlying this ability remain unclear. In this study, we present a computational model of attentive voice tracking consisting of four main computational blocks: A) glimpsed feature extraction, B) foreground-background segregation, C) state estimation, and D) top-down knowledge. Conceptually, the model brings together ideas related to auditory glimpses, vowel segregation, and Bayesian inference. Algorithmically, it combines sparse periodicity feature extraction, sequential Monte Carlo sampling, and probabilistic voice models. We evaluate the model by comparing it with data obtained in a psychoacoustic task that measured the ability to track one of two competing voices with time-varying parameters (fundamental frequency (F0) and the first two formants (F1, F2)). We test three model versions, which differ in the type of information used in the segregation stage: version 1 uses oracle F0, version 2 uses estimated F0, and version 3 uses estimated F0 together with oracle F1 and F2. Version 1 outperforms human listeners, version 2 is not sufficient to match their performance, and version 3 comes closest to explaining human performance.
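For illustration only, and not the authors' implementation: the sketch below shows the kind of sequential Monte Carlo (particle filter) state estimation named in the abstract, here reduced to tracking a single drifting F0 from noisy frame-wise periodicity measurements. The function names (simulate_f0, particle_filter), the random-walk state model, the Gaussian observation likelihood, and all parameter values are illustrative assumptions rather than the paper's probabilistic voice models.

```python
# Minimal sketch of sequential Monte Carlo F0 tracking (assumed setup, not the
# paper's model): a bootstrap particle filter with a random-walk state model
# and a Gaussian likelihood on noisy F0 observations.
import numpy as np

rng = np.random.default_rng(0)

def simulate_f0(n_frames=200, f0_start=120.0, drift=0.3, obs_noise=5.0):
    """Generate a slowly drifting F0 trajectory and noisy measurements of it."""
    f0 = f0_start + np.cumsum(rng.normal(0.0, drift, n_frames))
    obs = f0 + rng.normal(0.0, obs_noise, n_frames)
    return f0, obs

def particle_filter(obs, n_particles=500, proc_noise=1.0, obs_noise=5.0):
    """Sequential Monte Carlo estimate of F0, one predict/update/resample per frame."""
    particles = rng.uniform(80.0, 300.0, n_particles)   # initial F0 hypotheses (Hz)
    estimates = np.empty(len(obs))
    for t, y in enumerate(obs):
        # Prediction: random-walk dynamics for F0.
        particles += rng.normal(0.0, proc_noise, n_particles)
        # Update: Gaussian likelihood of the observed value (log-domain for stability).
        log_w = -0.5 * ((y - particles) / obs_noise) ** 2
        weights = np.exp(log_w - log_w.max())
        weights /= weights.sum()
        estimates[t] = np.dot(weights, particles)
        # Resample particles in proportion to their weights.
        particles = rng.choice(particles, size=n_particles, p=weights)
    return estimates

if __name__ == "__main__":
    true_f0, noisy_obs = simulate_f0()
    est_f0 = particle_filter(noisy_obs)
    print(f"mean abs. tracking error: {np.mean(np.abs(est_f0 - true_f0)):.2f} Hz")
```

In the full model, the observation step would instead score sparse periodicity glimpses against probabilistic voice models, and the state would include formants as well as F0; this sketch only illustrates the filtering mechanism itself.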