The National Quality Forum (NQF) evaluates healthcare performance measures for endorsement based on a broad set of criteria. We extracted data from NQF technical reports released between spring 2018 and spring 2019. Measures were commonly stewarded by federal agencies (44.29%), evaluated for maintenance (67.14%), classified as outcome (42.14%) or process (39.29%) measures, and risk adjusted using a statistical model (48.57%). For 80% of the measures reviewed, a patient advocate was present on the reviewing committee. Validity was evaluated using face validity (65.00%) or score-level empirical validity (67.14%), and reliability was frequently evaluated using score-level testing (71.43%). Although 91.56% of all reviewed measures were endorsed, most standing committee members voted moderate rather than high support on key assessment criteria such as measure validity, measure reliability, feasibility of use, and whether the measure addresses a key performance gap. Results show that although the Consensus Development Process includes multidisciplinary stakeholder input and thorough evaluations of measures, continued work is needed to identify and describe appropriate and robust methods for reliability and validity testing. Further work is also needed to study the extent to which stakeholder input is truly representative of diverse viewpoints and to improve processes for considering social factors when risk adjusting.