Using large, multi-national datasets for high-performance medical imaging AI systems requires innovation in privacy-preserving machine learning so models can train on sensitive data without requiring data transfer. Here we present PriMIA (Privacy-preserving Medical Image Analysis), a free, open-source software framework for differentially private, securely aggregated federated learning and encrypted inference on medical imaging data. We test PriMIA using a real-life case study in which an expert-level deep convolutional neural network classifies paediatric chest X-rays; the resulting model's classification performance is on par with locally, non-securely trained models. We theoretically and empirically evaluate our framework's performance and privacy guarantees, and demonstrate that the protections provided prevent the reconstruction of usable data by a gradient-based model inversion attack. Finally, we successfully employ the trained model in an end-to-end encrypted remote inference scenario using secure multi-party computation to prevent the disclosure of the data and the model.
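The idea of secure aggregation mentioned above can be illustrated with a minimal sketch. It uses generic pairwise additive masking over integers modulo a fixed value, which is an assumption made for illustration and not PriMIA's actual protocol. The names `mask_updates` and `aggregate`, the modulus, and the toy integer updates are all hypothetical. Each pair of clients shares a random mask that one adds and the other subtracts, so the server sees only masked vectors, yet their sum equals the sum of the true updates.

```python
import random


def mask_updates(updates, seed=0, modulus=2**32):
    """Return masked copies of integer-encoded client updates.

    For every pair of clients (i, j), a random mask is added to client i's
    update and subtracted from client j's, so all masks cancel in the sum.
    """
    rng = random.Random(seed)  # stands in for per-pair shared secrets (assumption)
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.randrange(modulus) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] = (masked[i][k] + mask[k]) % modulus
                masked[j][k] = (masked[j][k] - mask[k]) % modulus
    return masked


def aggregate(masked, modulus=2**32):
    """Sum the masked updates coordinate-wise; the pairwise masks cancel."""
    dim = len(masked[0])
    return [sum(m[k] for m in masked) % modulus for k in range(dim)]


# Toy integer updates from three hypothetical clients.
updates = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
masked = mask_updates(updates)
total = aggregate(masked)
```

In this sketch `total` equals the coordinate-wise sum of the unmasked updates, while each individual masked vector looks random to the aggregating server. A practical system would additionally need key agreement between clients, handling of dropouts, and a fixed-point encoding of real-valued gradients.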
Data privacy mechanisms are essential for rapidly scaling medical training databases to capture the heterogeneity of patient data distributions, a prerequisite for robust and generalizable machine learning systems. In the current COVID-19 pandemic, a major focus of artificial intelligence (AI) is interpreting chest CT, which can be readily used in the assessment and management of the disease. This paper demonstrates the feasibility of a federated learning method for detecting COVID-19 related CT abnormalities with external validation on patients from a multinational study. We recruited 132 patients from seven multinational centers: three internal hospitals in Hong Kong for training and testing, and four external, independent datasets from Mainland China and Germany for validating model generalizability. We also conducted case studies on longitudinal scans for automated estimation of lesion burden in hospitalized COVID-19 patients. We explore federated learning algorithms to develop a privacy-preserving AI model for COVID-19 medical image diagnosis with good generalization capability on unseen multinational datasets. Federated learning could provide an effective mechanism during pandemics to rapidly develop clinically useful AI across institutions and countries, overcoming the burden of central aggregation of large amounts of sensitive data.
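The abstract does not state which federated learning algorithm was used. As a hedged illustration only, the sketch below shows sample-size-weighted parameter averaging (commonly called federated averaging), in which each site trains locally and shares only model parameters. The function name `federated_average` and the toy parameter vectors and site sizes are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighting each by its sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]


# Two hypothetical sites with 1 and 3 local training samples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Here the site with more samples pulls the global parameters toward its own values; no patient images leave either site.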
The evolving dynamics of coronavirus disease 2019 (COVID-19) and the increasing infection numbers require diagnostic tools to identify patients at high risk for a severe disease course. Here we evaluate clinical and imaging parameters for estimating the need for intensive care unit (ICU) treatment. We collected clinical, laboratory and imaging data from 65 patients with COVID-19 infection confirmed by polymerase chain reaction (PCR) testing. Two radiologists evaluated the severity of findings in computed tomography (CT) images on a scale from 1 (no characteristic signs of COVID-19) to 5 (confluent ground glass opacities in over 50% of the lung parenchyma). The volume of affected lung was quantified using commercially available software. Machine learning modelling was performed to estimate the risk of ICU treatment. Patients with a severe course of COVID-19 had significantly increased interleukin (IL)-6, C-reactive protein (CRP), and leukocyte counts and significantly decreased lymphocyte counts. The radiological severity grading was significantly higher in ICU patients. Multivariate random forest modelling showed a mean ± standard deviation sensitivity, specificity and accuracy of 0.72 ± 0.1, 0.86 ± 0.16 and 0.80 ± 0.1, respectively, and an area under the receiver operating characteristic curve (ROC-AUC) of 0.79 ± 0.1. The need for ICU treatment is independently associated with affected lung volume, radiological severity score, CRP, and IL-6.
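For readers unfamiliar with the reported metrics, the following minimal sketch shows how sensitivity, specificity and accuracy are derived from binary predictions. The function name `binary_metrics` and the labels are made up for illustration and are not data from the study; 1 is assumed to mean "needed ICU treatment".

```python
def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives detected
        "specificity": tn / (tn + fp),  # share of actual negatives detected
        "accuracy": (tp + tn) / len(pairs),
    }


# Illustrative labels only (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

A mean ± standard deviation such as those reported above would typically come from repeating this computation over several cross-validation folds or resamples and summarizing the per-fold values.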