Americans enjoy ratings and rankings: athletic teams, restaurants, music albums, television shows, movies, automobiles, books, hotels, cities, and even colleges are featured in annual top 10, top 100, or best-of lists. The publications that sponsor such articles command wide appeal among consumers and have spawned a growing rating industry. The rules that underlie such ratings have many features in common: (1) important but poorly defined concepts, like the road handling of a car, are quantified using conveniently available performance metrics (eg, acceleration and braking distances) or subjective surveys of experience; (2) these proxies are then combined into an overall index using arbitrarily chosen decision rules; and (3) the composite index is translated into a ranking list or, more often, a quasi-metric such as a star rating. Not surprisingly, how one selects the individual metrics and constructs decision rules for the composite can dramatically reorder rankings.1

For situations such as sports or cinema, ranking lists (eg, league tables) and star ratings promote friendly jousting and harmless amusement. In other settings, transparency about performance may help consumers choose a superior product and may prompt a complacent business to improve the quality of its products and user satisfaction with them.

The rub comes when public policy directs such tactics at complex services such as education or health care, where the stakes are higher. When the public is paying for services, it is natural to demand added value for the investment and to request that such value be conveyed in a way that nonexperts can understand. Composites, rankings, and their star-rating derivatives can potentially serve such a purpose. When a choice of health care group is feasible, these shortcuts may promote better decision making by the public than more technical presentations of detailed performance comparisons.2 Yet, for years, experts have raised cautionary flags regarding the inherent statistical uncertainties in using league tables or similar approaches.3,4

The policy conundrum is amplified when dubious methods of ordering are used to adjust payment. Health outcomes are demonstrably dependent on underlying health risks determined by genetic, social, educational, economic, and behavioral factors that are beyond the immediate ability of health care professionals to influence. Clinicians who disproportionately serve disadvantaged populations have been shown to score lower on rankings and would therefore be likely to experience a decrease in reimbursement under zero-sum pay-for-performance plans.5 It would be puzzling social policy to shift those dollars toward health care professionals serving the healthy, wealthy, and wise, but that may be the unintended consequence of shortcuts such as composites and star ratings.

An important and timely empirical contribution to this debate is made by Nguyen and colleagues.6 In a cross-sectional analysis, they compared the performance on standard performance measur...