Background: Research abstracts are submitted for presentation at scientific conferences; however, criteria for judging abstracts are variable. We sought to develop two rigorous abstract scoring rubrics for education research submissions reporting (1) quantitative data and (2) qualitative data and then to collect validity evidence to support score interpretation.
Methods: We used a modified Delphi method to achieve expert consensus for scoring rubric items to optimize content validity. Eight education research experts participated in two separate modified Delphi processes, one to generate quantitative research items and one for qualitative. Modifications were made between rounds based on item scores and expert feedback. Homogeneity of ratings in the Delphi process was calculated using Cronbach's alpha, with increasing homogeneity considered an indication of consensus. Rubrics were piloted by scoring abstracts from 22 quantitative publications from AEM Education and Training "Critical Appraisal of Emergency Medicine Education Research" (11 highlighted for excellent methodology and 11 that were not) and 10 qualitative publications (five highlighted for excellent methodology and five that were not). Intraclass correlation coefficient (ICC) estimates of reliability were calculated.
Results: Each rubric required three rounds of a modified Delphi process. The resulting quantitative rubric contained nine items: quality of objectives, appropriateness of methods, outcomes, data analysis, generalizability, importance to medical education, innovation, quality of writing, and strength of conclusions (Cronbach's α for the third round = 0.922, ICC for total scores during piloting = 0.893). The resulting qualitative rubric contained seven items: quality of study aims, general methods, data collection, sampling, data analysis, writing quality, and strength of conclusions (Cronbach's α for the third round = 0.913, ICC for the total scores during piloting = 0.788).
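For readers unfamiliar with the homogeneity statistic named in the Methods, the following is a minimal sketch of the standard Cronbach's alpha formula, α = k/(k − 1) × (1 − Σ column variances / variance of row totals). It is not the authors' analysis code. The layout of the ratings matrix (which dimension holds experts and which holds rubric items) is not stated in the abstract and is an assumption here, as are the function name `cronbach_alpha` and the example scores, which are invented purely for illustration.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Return Cronbach's alpha for a rectangular table of scores.

    ratings: list of rows, each row holding one score per column.
    Assumption (not from the abstract): rows are the things being
    rated and columns are the sources of ratings whose homogeneity
    is being assessed.
    """
    n_rows = len(ratings)
    k = len(ratings[0])
    if n_rows < 2 or k < 2:
        raise ValueError("need at least two rows and two columns")
    if any(len(row) != k for row in ratings):
        raise ValueError("all rows must have the same length")

    # Sample variance of each column, summed across columns.
    column_variance_sum = sum(variance(col) for col in zip(*ratings))

    # Sample variance of the per-row total scores.
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("row totals have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - column_variance_sum / total_variance)


# Hypothetical scores, invented for illustration only.
example_ratings = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
]
print(round(cronbach_alpha(example_ratings), 3))
```

Values closer to 1 indicate more homogeneous ratings, which is the sense in which the study treated rising alpha across Delphi rounds as a sign of consensus.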