Acute kidney injury (AKI) is a common complication among oncology patients and is associated with lower remission rates and higher mortality. To reduce the impact of this condition, we aimed to predict AKI earlier than existing tools, allowing clinical intervention before it occurs. We trained a random forest model on 597,403 routinely collected blood test results from 48,865 patients undergoing cancer treatment at The Christie NHS Foundation Trust between January 2017 and May 2020, to identify AKI events occurring within the next 30 days. AKI risk levels were assigned to upcoming AKI events and tested through a prospective analysis between June and August 2020. The trained model gave an AUROC of 0.881 (95% CI 0.878–0.883) when predictions were assessed per blood test for AKI occurrences within 30 days. In the prospective validation from 1 June to 31 August 2020, the assigned risk levels identified 73.8% of patients with an AKI event before at least one of their AKI occurrences, and 61.2% of AKI occurrences overall. Our results suggest that around 60% of AKI occurrences experienced by patients undergoing cancer treatment could be identified using routinely collected blood results, allowing clinical remedial action to be taken and disruption to treatment by AKI to be minimised.
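To make the reported metric concrete, the following is a minimal sketch of how a per-prediction AUROC and a 95% confidence interval could be computed. It is illustrative only: the abstract does not say how the authors derived their interval, so the percentile bootstrap used here is an assumption, and the function names `auroc` and `bootstrap_ci` are hypothetical. Resampling individual predictions also ignores the fact that one patient contributes many blood tests, which a real analysis would probably need to account for.

```python
import random


def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U) with average ranks for tied scores.

    labels: list of 0/1 ints (1 = AKI within the prediction window)
    scores: list of model risk scores, same length as labels
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, not the paper's)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("need both classes to bootstrap AUROC")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

With labels and scores taken per blood test, `auroc(labels, scores)` would give the point estimate and `bootstrap_ci(labels, scores)` an approximate interval around it.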
Objective: Colorectal cancer is a common cause of death and morbidity. A significant amount of data are routinely collected during patient treatment, but they are not generally available for research. The National Institute for Health Research Health Informatics Collaborative in the UK is developing infrastructure to enable routinely collected data to be used for collaborative, cross-centre research. This paper presents an overview of the process for collating colorectal cancer data and explores the potential of using this data source.
Methods: Clinical data were collected from three pilot Trusts, standardised and collated. Not all data were collected in a readily extractable format for research. Natural language processing (NLP) was used to extract relevant information from pseudonymised imaging and histopathology reports. Combining data from many sources allowed reconstruction of longitudinal histories for each patient that could be presented graphically.
Results: Three pilot Trusts submitted data, covering 12 903 patients with a diagnosis of colorectal cancer since 2012, with NLP implemented for 4150 patients. Timelines showing individual patient longitudinal history can be grouped into common treatment patterns, visually presenting clusters and outliers for analysis. Difficulties and gaps in data sources have been identified and addressed.
Discussion: Algorithms for analysing routinely collected data from a wide range of sites and sources have been developed and refined to provide a rich data set that will be used to better understand the natural history, treatment variation and optimal management of colorectal cancer.
Conclusion: The data set has great potential to facilitate research into colorectal cancer.