Artificial intelligence (AI) methods have the potential to revolutionize the domain of medicine, as witnessed, for example, in medical imaging, where the application of computer vision techniques, traditional machine learning 1,2 and, more recently, deep neural networks has achieved remarkable successes. This progress can be ascribed to the release of large, curated corpora of images (ImageNet 3 perhaps being the best known), giving rise to performant pre-trained algorithms that facilitate transfer learning and have led to an increasing number of publications both in oncology (with applications in tumour detection 4,5, genomic characterization 6,7, tumour subtyping 8,9, grading prediction 10, outcome risk assessment 11 and risk of relapse quantification 12) and in non-oncologic applications, such as chest X-ray analysis 13 and retinal fundus imaging 14.
To allow medical imaging AI applications to offer clinical decision support suitable for precision medicine implementations, even larger amounts of imaging and clinical data will be required. Large cross-sectional population studies based solely on volunteer participation, such as the UK Biobank 15, cannot fill this gap. Even the largest current imaging studies in the field 4,5, demonstrating better-than-human performance in their respective tasks, include considerably less data than, for example, ImageNet 3, or the amount of data used to train algorithmic agents for the games of Go or StarCraft 16,17 or autonomous vehicles 18.
Furthermore, such datasets often stem from relatively few institutions, geographic regions or patient demographics, and might therefore contain unquantifiable bias due to their incompleteness with respect to co-variables such as comorbidities, ethnicity, gender and so on 19. However, considering that the sum of the world's patient databases probably contains enough data to answer many significant questions, it becomes clear that the inability to access and leverage this data poses a significant barrier to AI applications in this field.
The lack of standardized, electronic patient records is one reason. Electronic patient data management is expensive 20, and hospitals in underprivileged regions might be unable to afford participation in studies requiring it, potentially perpetuating the aforementioned issues of bias and fairness. In the medical imaging field, electronic data management is the standard: Digital Imaging and Communications in Medicine (DICOM) 21 is the universally adopted imaging data format, and electronic file storage is the near-global standard of care. Even where non-digital formats are still in use, the archival nature of, for instance, film radiography allows post hoc digitization, as seen, for example, in the CBIS-DDSM dataset 22, which consists of digitized film breast radiographs. Digital imaging data, easily shareable, permanently storable and remotely accessible in the cloud, has driven the aforementioned successes of medical imaging AI.
The second issue representing a stark deterrent from multi-institutional/multi-national AI trials 23 is ...
Using large, multi-national datasets for high-performance medical imaging AI systems requires innovation in privacy-preserving machine learning so models can train on sensitive data without requiring data transfer. Here we present PriMIA (Privacy-preserving Medical Image Analysis), a free, open-source software framework for differentially private, securely aggregated federated learning and encrypted inference on medical imaging data. We test PriMIA using a real-life case study in which an expert-level deep convolutional neural network classifies paediatric chest X-rays; the resulting model's classification performance is on par with locally, non-securely trained models. We theoretically and empirically evaluate our framework's performance and privacy guarantees, and demonstrate that the protections provided prevent the reconstruction of usable data by a gradient-based model inversion attack. Finally, we successfully employ the trained model in an end-to-end encrypted remote inference scenario using secure multi-party computation to prevent the disclosure of the data and the model.
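
To make the combination of techniques named above more concrete, the following toy sketch shows one round of differentially private, securely aggregated federated averaging. It is an illustration under stated assumptions, not PriMIA's implementation or API: every function name, the floating-point pairwise masking scheme, and the parameters `max_norm` and `sigma` are hypothetical choices made for this example. Real secure aggregation protocols typically operate on integers in a finite field and derive masks through key agreement, and a real differential privacy guarantee requires calibrating the noise and tracking the privacy budget, neither of which is shown here.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale a client update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def add_gaussian_noise(update, sigma, rng):
    """Add independent Gaussian noise to every coordinate (sigma is illustrative)."""
    return [x + rng.gauss(0.0, sigma) for x in update]


def pairwise_masks(num_clients, dim, rng):
    """Build one mask per client such that all masks sum to (approximately) zero.

    Each pair (i, j) shares a random vector that client i adds and client j
    subtracts. Assumption: here a single generator stands in for the pairwise
    shared secrets that clients would agree on in a real protocol.
    """
    masks = [[0.0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.uniform(-1000.0, 1000.0) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]
                masks[j][k] -= shared[k]
    return masks


def secure_dp_average(updates, max_norm, sigma, rng):
    """Average client updates; the aggregator only sees masked, noised vectors."""
    num_clients = len(updates)
    dim = len(updates[0])
    masks = pairwise_masks(num_clients, dim, rng)

    masked_updates = []
    for update, mask in zip(updates, masks):
        # Client side: clip, add noise, then hide the result behind the mask.
        private = add_gaussian_noise(clip_update(update, max_norm), sigma, rng)
        masked_updates.append([p + m for p, m in zip(private, mask)])

    # Server side: the masks cancel in the sum (up to floating-point rounding).
    total = [sum(column) for column in zip(*masked_updates)]
    return [t / num_clients for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    client_updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
    print(secure_dp_average(client_updates, max_norm=1.0, sigma=0.1, rng=rng))
```

In this sketch no individual masked vector reveals a client's update, yet the masks cancel in the sum, so the server recovers only the noisy average. Setting `sigma=0.0` removes the noise and makes the result equal to the plain average of the clipped updates, which is a convenient way to check that the masking cancels.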