Since its inception in the mid-1980s, the General Practice Research Database (GPRD) has undergone many changes but remains the largest validated and most utilised primary care database in the UK. Its use in pharmacoepidemiology stretches back many years, with over 800 original research papers now published. The database has been administered by the Medicines and Healthcare products Regulatory Agency since 2001, and the last 5 years have seen a rebuild of the database processing system that enhances access to the data, together with a concomitant push towards broadening the applications of the database. New methodologies including real-world harm-benefit assessment, pharmacogenetic studies and pragmatic randomised controlled trials within the database are being implemented. A substantive and unique linkage program (using a trusted third party) has enabled access to secondary care data and disease-specific registry data as well as socio-economic data and death registration data. The utility of anonymised free text, accessed in a safe and appropriate manner, is being explored using both simple techniques and more complex ones such as natural language processing.
Objective: UK primary care databases, which contain diagnostic, demographic and prescribing information for millions of patients geographically representative of the UK, represent a significant resource for health services and clinical research. They can be used to identify patients with a specified disease or condition (phenotyping) and to investigate patterns of diagnosis and symptoms. Currently, extracting such information manually is time-consuming and requires considerable expertise. In order to exploit more fully the potential of these large and complex databases, our interdisciplinary team developed generic methods allowing access to different types of user. Materials and methods: Using the Clinical Practice Research Datalink database, we have developed an online user-focused system (TrialViz), which enables users interactively to select suitable medical general practices based on two criteria: suitability of the patient base for the intended study (phenotyping) and measures of data quality. Results: An end-to-end system, underpinned by an innovative search algorithm, allows the user to extract information in near real-time via an intuitive query interface and to explore this information using interactive visualization tools. A usability evaluation of this system produced positive results. Discussion: We present the challenges and results in the development of TrialViz and our plans for its extension for wider applications of clinical research. Conclusions: Our fast search algorithms and simple query algorithms represent a significant advance for users of clinical research databases.
There is currently no widely recognised methodology for undertaking data quality assessment in electronic health records used for research. In an attempt to address this, we have developed a protocol for measuring and monitoring data quality in primary care research databases, whereby practice-based data quality measures are tailored to the intended use of the data. Our approach was informed by an in-depth investigation of aspects of data quality in the Clinical Practice Research Datalink Gold database and presentations of the results to data users. Although based on a primary care database, much of our proposed approach would be equally applicable to other health care databases.
We describe experiments into the use of distributional similarity for acquiring lexical information from clinical free text, in particular notes typed by primary care physicians (general practitioners). We also present a novel approach to lexical acquisition from 'sensitive' text, which does not require the text to be manually anonymised (a very expensive process) and therefore allows much larger datasets to be used than would normally be possible.
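As a rough illustration of what distributional similarity means in this setting, the sketch below builds co-occurrence vectors from a context window and compares words by cosine similarity. It is a generic textbook formulation, not the method used in the paper: the window size, the function names `context_vectors` and `cosine`, and the toy notes are all assumptions made for the example, and the notes are invented rather than taken from any clinical record.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy notes (hypothetical, for illustration only).
notes = [
    "pt complains of chest pain".split(),
    "patient complains of chest pain".split(),
    "pt reports mild headache".split(),
    "patient reports mild headache".split(),
]

vecs = context_vectors(notes)
sim_abbrev = cosine(vecs["pt"], vecs["patient"])   # same contexts
sim_other = cosine(vecs["pt"], vecs["headache"])   # partly shared contexts
```

In this toy data the abbreviation "pt" and the word "patient" occur in identical contexts, so they score higher than "pt" and "headache". Real clinical notes would call for far larger corpora and typically some weighting of the raw counts.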