A foundational assumption of human communication is that speakers should say as much as necessary, but no more. Yet people routinely produce redundant adjectives, and their propensity to do so varies cross-linguistically. Here, we propose a computational theory whereby speakers create referential expressions designed to facilitate listeners' reference resolution as listeners process words in real time. We present a computational model of our account, the Incremental Collaborative Efficiency (ICE) model, which generates referential expressions by considering listeners' real-time incremental processing and reference identification. We apply the ICE framework to physical reference, showing that speakers construct expressions designed to minimize listeners' expected visual search effort during online language processing. Our model captures a number of known effects in the literature, including cross-linguistic differences in speakers' propensity to over-specify. Moreover, the ICE model predicts graded acceptability judgments with quantitative accuracy, systematically outperforming an alternative, brevity-based model. Our findings suggest that physical reference production is best understood as driven by a collaborative goal to help the listener identify the intended referent, rather than by an egocentric effort to minimize utterance length.