Background
We evaluate the construct validity of a proposed procedure for eliciting lay preferences among health care policy options, suitable for use in structured surveys. The procedure is illustrated with breast cancer screening, a domain in which people may have heterogeneous preferences.

Methods
Our procedure applies behavioral decision research principles to eliciting preferences among policy options expressed in quantitative terms. Three hundred women older than 18 y without a history of breast cancer were recruited through Amazon MTurk. Participants evaluated 4 screening options for each of 4 groups of women with varying risk of breast cancer. Each option was characterized by estimates of 3 primary outcomes: breast cancer deaths, false alarms, and overdiagnosis resulting in unnecessary treatment of cancers that would not progress. These estimates were based on those currently being developed by the Breast Cancer Surveillance Consortium. For each risk group, participants stated how frequently they would wish to receive screening if the predicted outcomes applied to them.

Results
A preregistered test found that preferences were robust enough to be unaffected by the order in which the outcomes were introduced and displayed. Other tests of construct validity also suggested that respondents generally understood the task and expressed consistent preferences. Those preferences were related to participants’ age and mammography history but not to measures of their numeracy, subjective numeracy, or demographics. There was considerable heterogeneity in participants’ preferences.

Conclusions
Members of the public can be engaged more fully in informing future screening guidelines if they evaluate screening options characterized by expected health outcomes expressed in quantitative terms. We offer such a procedure and evaluate its construct validity with a diverse sample of women.

Highlights
A novel survey method for eliciting lay preferences for breast cancer screening is proposed and evaluated in terms of its construct validity.
Participants were generally insensitive to irrelevant task features (e.g., order of presentation) and sensitive to relevant ones (e.g., quantitative estimates of breast cancer risk, harms from screening).
The proposed method elicits lay preferences in terms that can inform future screening guidelines, potentially improving communication between the public and policy makers.