PURPOSE We assessed the interrater reliability (IRR) of chart abstractors within a randomized trial of cardiovascular care in primary care. We report our findings, outline issues, and provide recommendations related to determining sample size, frequency of verification, and minimum thresholds for 2 measures of IRR: the κ statistic and percent agreement.
METHODS We designed a data quality monitoring procedure with 4 parts: use of standardized protocols and forms, extensive training, continuous monitoring of IRR, and a quality improvement feedback mechanism. Four abstractors checked a 5% sample of charts at 3 time points for a predefined set of indicators of the quality of care. We set our quality threshold for IRR at a κ of 0.75, a percent agreement of 95%, or both.

RESULTS Abstractors reabstracted a sample of charts in 16 of 27 primary care practices, checking a total of 132 charts with 38 indicators per chart. The overall κ across all items was 0.91 (95% confidence interval, 0.90-0.92), and the overall percent agreement was 94.3%, signifying excellent agreement between abstractors. We gave feedback to the abstractors to highlight items that had a κ of less than 0.70 or a percent agreement of less than 95%. No practice had to have its charts reabstracted because of poor quality.

CONCLUSIONS A 5% sampling of charts for quality control using IRR analysis yielded κ and agreement levels that met or exceeded our quality thresholds. Using 3 time points during the chart audit phase allows for early quality control as well as ongoing quality monitoring. Our results can be used as a guide and benchmark for other medical chart review studies in primary care.
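As a minimal sketch of the two IRR measures described above (not the study's actual analysis code), the following Python computes Cohen's κ and percent agreement for one quality-of-care indicator across a set of reabstracted charts, then applies the feedback rule from the abstract (flag items with κ < 0.70 or agreement < 95%). The ratings and chart count are hypothetical, invented for illustration.

```python
from collections import Counter

def percent_agreement(a, b):
    """Percent of charts on which the two abstractors gave the same rating."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa = (p_o - p_e) / (1 - p_e): agreement corrected for chance."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    # Chance agreement expected from each abstractor's marginal frequencies
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    if p_e == 1.0:  # both abstractors rated every chart identically and uniformly
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings (1 = indicator documented, 0 = not) for one indicator
# across 20 reabstracted charts; these data are illustrative, not from the study.
abstractor1 = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1]
abstractor2 = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1]

kappa = cohens_kappa(abstractor1, abstractor2)
agree = percent_agreement(abstractor1, abstractor2)
# Feedback rule from the abstract: kappa < 0.70 or percent agreement < 95%
needs_feedback = kappa < 0.70 or agree < 95.0
print(f"kappa = {kappa:.2f}, agreement = {agree:.1f}%, flag: {needs_feedback}")
```

In this example the item would be flagged on the agreement criterion (90.0%) even though κ (0.76) clears 0.70, which illustrates why monitoring both measures is useful: the two can diverge, particularly when the prevalence of an indicator is skewed.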