“…The accuracy of coded diagnoses in the EHR is of fundamental importance in Big Data research, 1, 4 and there is a need to understand the factors that influence it. Studies of EHR data quality have typically assessed the usefulness of the data for specific use cases, such as disease surveillance, 5 and predictive models for specific outcomes. 6 Some studies have assessed the accuracy of coded data, 7 but as this often requires a manual review, 8 there are few studies assessing the concordance between the coded data and the clinical documentation or the factors that influence it, such as the physician workflow and EHR design.…”