Visual search is one of the most ecologically important perceptual task domains. One research tradition has studied visual search using simple, parametric stimuli and a signal detection theory or Bayesian modeling framework. However, this tradition has mostly focused on homogeneous distractors (distractors that are identical to each other), which are not very realistic. In a different tradition, Duncan and Humphreys (1989) conducted a landmark study on visual search in the presence of heterogeneous distractors. However, they used complex stimuli, making modeling and dissociation of component processes more difficult. Here, we attempt to unify these research traditions by systematically examining visual search with heterogeneous distractors using simple, parametric stimuli and a Bayesian modeling framework. Specifically, our experiment varied multiple factors that could potentially influence performance: set size, task (N-AFC localization versus detection), whether the target was revealed before or after the search array (perception versus memory), and stimulus spacing. Across all conditions, we found that performance decreased with increasing set size. We then examined various within-trial summary statistics, and found that the minimum target-to-distractor feature difference was a stronger predictor of behavior than either the mean target-to-distractor difference or the distractor variance. To move from summary statistics to process-level understanding, we formulated a Bayesian optimal-observer model with a variable-precision encoding stage. This model, which makes trial-by-trial predictions, accurately accounted for all summary statistics. This was still the case when we fitted the model jointly to the localization and detection data. We replicated these results in a separate experiment with reduced stimulus spacing.
Together, our results represent a critique of Duncan and Humphreys's purely descriptive approach, bring visual search with heterogeneous distractors firmly within the reach of quantitative process models, and affirm the unreasonable effectiveness of Bayesian models in explaining visual search.
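
To make the three within-trial summary statistics named above concrete, the following is a minimal sketch of how they might be computed for a single display. It is an illustration only, not the analysis code used in the study. The function name `summary_statistics`, the dictionary keys, and the example values are hypothetical, and the sketch assumes a scalar feature on a linear scale with population variance; a circular feature such as orientation would need wrapped differences and a circular measure of spread.

```python
from statistics import mean, pvariance


def summary_statistics(target, distractors):
    """Within-trial summary statistics for one search display.

    Assumption: features are scalars on a linear scale. Circular features
    would require wrapped differences instead of plain subtraction.
    """
    # Absolute feature difference between the target and each distractor.
    diffs = [abs(d - target) for d in distractors]
    return {
        "min_diff": min(diffs),                       # minimum target-to-distractor difference
        "mean_diff": mean(diffs),                     # mean target-to-distractor difference
        "distractor_var": pvariance(distractors),     # population variance of distractor features
    }


# Hypothetical display: target at 0, three heterogeneous distractors.
stats = summary_statistics(target=0.0, distractors=[10.0, -30.0, 50.0])
```

In this example the nearest distractor differs from the target by 10 feature units while the mean difference is 30, which illustrates how the minimum and the mean can diverge when distractors are heterogeneous.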