Deep learning, due to its unprecedented success in tasks such as image classification, has emerged as a new tool in image reconstruction with potential to change the field. In this paper, we demonstrate a crucial phenomenon: Deep learning typically yields unstable methods for image reconstruction. The instabilities usually occur in several forms: 1) certain tiny, almost undetectable perturbations, both in the image and sampling domain, may result in severe artefacts in the reconstruction; 2) a small structural change, for example, a tumor, may not be captured in the reconstructed image; and 3) (a counterintuitive type of instability) more samples may yield poorer performance. We present a stability test, with algorithms and easy-to-use software, that detects these instability phenomena. The test is aimed at researchers, to test their networks for instabilities, and at government agencies, such as the Food and Drug Administration (FDA), to secure the safe use of deep learning methods.
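The released software is not reproduced here, but the core idea behind such a stability test, searching for a tiny worst-case perturbation that maximally changes the reconstruction, can be sketched as follows. This is a minimal illustration under assumed names: `net` is a trained reconstruction network mapping measurements to an image, `A` is a hypothetical forward (sampling) operator given as a matrix, and `stability_test` is not part of any released package.

```python
# Minimal sketch of a perturbation-based stability test (assumed setup, not
# the authors' software): find a small image-domain perturbation r that
# maximizes the change in the network's reconstruction.
import torch

def stability_test(net, A, x, epsilon=0.01, steps=200, lr=1e-2):
    """Projected gradient ascent for a worst-case perturbation of the image x.

    epsilon bounds the perturbation in the l-infinity norm.
    """
    x = x.detach()
    baseline = net(A @ x).detach()              # reconstruction from clean measurements
    r = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([r], lr=lr)
    for _ in range(steps):
        recon = net(A @ (x + r))                # reconstruction from perturbed input
        loss = -torch.norm(recon - baseline)    # ascend on the reconstruction change
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():                   # keep the perturbation tiny
            r.clamp_(-epsilon, epsilon)
    r = r.detach()
    worst = net(A @ (x + r))
    return r, (worst - baseline).norm().item()  # perturbation and induced change
```

A large returned change for a perturbation that is visually undetectable in `x + r` is exactly the first form of instability described above.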
There is overwhelming empirical evidence that deep learning (DL) leads to unstable methods in applications ranging from image classification and computer vision to voice recognition and automated diagnosis in medicine. Recently, a similar instability phenomenon has been discovered when DL is used to solve certain problems in computational science, namely, inverse problems in imaging. In this paper, we present a comprehensive mathematical analysis explaining the many facets of the instability phenomenon in DL for inverse problems. Our main results not only explain why this phenomenon occurs, but also shed light on why finding a cure for instabilities is so difficult in practice. Additionally, these theorems show that instabilities are typically not rare events; rather, they can occur even when the measurements are subject to completely random noise, and consequently they show how easy it can be to destabilise certain trained neural networks. We also examine the delicate balance between reconstruction performance and stability, and in particular, how DL methods may outperform state-of-the-art sparse regularization methods, but at the cost of instability. Finally, we demonstrate a counterintuitive phenomenon: training a neural network may generically not yield an optimal reconstruction method for an inverse problem.
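To make the random-noise claim concrete, a simple empirical probe (again a sketch with the same assumed `net` and `A` as above, not the paper's code) is to add Gaussian noise of increasing magnitude to the measurements and record how strongly the reconstruction changes relative to the noiseless case.

```python
# Sketch: probe the sensitivity of a trained reconstruction network to purely
# random measurement noise at several noise levels (assumed setup).
import torch

def noise_sensitivity(net, A, x, noise_levels=(0.001, 0.01, 0.1), trials=20):
    """Average relative reconstruction change under Gaussian measurement noise."""
    y = A @ x
    baseline = net(y)
    results = {}
    for sigma in noise_levels:
        diffs = []
        for _ in range(trials):
            e = sigma * torch.randn_like(y)      # random measurement noise
            diffs.append((net(y + e) - baseline).norm() / baseline.norm())
        results[sigma] = torch.stack(diffs).mean().item()
    return results  # large values at small sigma indicate instability
```

A stable method should show reconstruction changes that scale gracefully with the noise level; a sharp blow-up at small `sigma` is the behaviour the analysis above predicts for certain trained networks.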
Significance
Instability is the Achilles’ heel of modern artificial intelligence (AI) and a paradox, with training algorithms finding unstable neural networks (NNs) despite the existence of stable ones. This foundational issue relates to Smale’s 18th mathematical problem for the 21st century on the limits of AI. By expanding methodologies initiated by Gödel and Turing, we demonstrate limitations on the existence of (even randomized) algorithms for computing NNs. Despite numerous existence results of NNs with great approximation properties, only in specific cases do there also exist algorithms that can compute them. We initiate a classification theory on which NNs can be trained and introduce NNs that—under suitable conditions—are robust to perturbations and exponentially accurate in the number of hidden layers.
Artificial intelligence (AI) has the potential to greatly improve the delivery of healthcare and other services that advance population health and well-being. However, the use of AI in healthcare also brings potential risks that may cause unintended harm. To guide future developments in AI, the High-Level Expert Group on AI (AI HLEG) set up by the European Commission (EC) recently published ethics guidelines for what it terms “trustworthy” AI. These guidelines are aimed at a variety of stakeholders, especially guiding practitioners toward more ethical and more robust applications of AI. In line with the efforts of the EC, AI ethics scholarship focuses increasingly on converting abstract principles into actionable recommendations. However, the interpretation, relevance, and implementation of trustworthy AI depend on the domain and the context in which the AI system is used. The main contribution of this paper is to demonstrate how to use the general AI HLEG trustworthy AI guidelines in practice in the healthcare domain. To this end, we present a best practice for assessing the use of machine learning as a supportive tool to recognize cardiac arrest in emergency calls. The AI system under assessment is currently in use in the city of Copenhagen in Denmark. The assessment is carried out by an independent team composed of philosophers, policy makers, social scientists, and technical, legal, and medical experts. By leveraging an interdisciplinary team, we aim to expose the complex trade-offs and the necessity for such thorough human review when tackling socio-technical applications of AI in healthcare. For the assessment, we use a process for assessing trustworthy AI, called Z-Inspection®, to identify specific challenges and potential ethical trade-offs when considering AI in practice.