Purpose: When using convolutional neural networks (CNNs) for segmentation of organs and lesions in medical images, the conventional approach is to work with inputs and outputs either as single slices [two‐dimensional (2D)] or as whole volumes [three‐dimensional (3D)]. One common alternative, in this study denoted as pseudo‐3D, is to use a stack of adjacent slices as input and produce a prediction for at least the central slice. This approach gives the network the possibility to capture 3D spatial information, with only a minor additional computational cost.
Methods: In this study, we systematically evaluate the segmentation performance and computational costs of this pseudo‐3D approach as a function of the number of input slices, and compare the results to conventional end‐to‐end 2D and 3D CNNs, and to triplanar orthogonal 2D CNNs. The standard pseudo‐3D method regards the neighboring slices as multiple input image channels. We additionally design and evaluate a novel, simple approach where the input stack is a volumetric input that is repeatedly convolved in 3D to obtain a 2D feature map. This 2D map is in turn fed into a standard 2D network. We conducted experiments using two different CNN backbone architectures and on eight diverse data sets covering different anatomical regions, imaging modalities, and segmentation tasks.
Results: We found that while both pseudo‐3D methods can process a large number of slices at once and still be computationally much more efficient than fully 3D CNNs, a significant improvement over a regular 2D CNN was only observed with two of the eight data sets. Triplanar networks had the poorest performance of all the evaluated models. An analysis of the structural properties of the segmentation masks revealed no relation between these properties and how the segmentation performance varied with the number of input slices. A post hoc rank sum test that combined all metrics and data sets showed that only our newly proposed pseudo‐3D method with an input size of 13 slices outperformed almost all other methods.
Conclusion: In the general case, multislice inputs appear not to improve segmentation results over using 2D or 3D CNNs. For the particular case of 13 input slices, the proposed novel pseudo‐3D method does appear to have a slight advantage across all data sets compared to all other methods evaluated in this work.
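The standard pseudo‐3D input described in this abstract (adjacent slices treated as input channels, with a prediction for the central slice) can be illustrated with a minimal Python sketch. The function names `neighbor_indices` and `pseudo3d_stack` are hypothetical, and repeating the first or last slice at the volume boundary is an assumption; the abstract does not say how the authors handle edges.

```python
from typing import List, Sequence, TypeVar

S = TypeVar("S")  # type of one 2D slice (e.g. a nested list or an array)


def neighbor_indices(center: int, n_slices: int, depth: int) -> List[int]:
    """Return the indices of the n_slices slices centered on `center`.

    Indices falling outside the volume are clamped, which repeats the
    first/last slice at the boundary (an assumption, not from the source).
    """
    if n_slices < 1 or n_slices % 2 == 0:
        raise ValueError("n_slices must be a positive odd number")
    if not 0 <= center < depth:
        raise IndexError("center slice is outside the volume")
    half = n_slices // 2
    return [min(max(center + offset, 0), depth - 1)
            for offset in range(-half, half + 1)]


def pseudo3d_stack(volume: Sequence[S], center: int, n_slices: int) -> List[S]:
    """Gather the slices that would be fed to a 2D network as channels."""
    return [volume[i] for i in neighbor_indices(center, n_slices, len(volume))]


# Toy volume: 5 slices of size 2x2, each filled with its own slice index.
volume = [[[z, z], [z, z]] for z in range(5)]
stack = pseudo3d_stack(volume, center=0, n_slices=3)
print(len(stack))                 # number of input channels
print(neighbor_indices(0, 3, 5))  # indices used at the lower boundary
```

With `n_slices=1` this reduces to the ordinary 2D case, while larger odd values (such as the 13 slices highlighted in the results) widen the context along the slice axis without changing the 2D network that follows.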
Deep learning methods have proven extremely effective at performing a variety of medical image analysis tasks. Given their potential use in clinical routine, however, their lack of transparency remains one of their few weak points, raising concerns about their behavior and failure modes. While most research to infer model behavior has focused on indirect strategies that estimate prediction uncertainties and visualize model support in the input image space, the ability to explicitly query a prediction model regarding its image content offers a more direct way to determine the behavior of trained models. To this end, we present a novel Visual Question Answering approach that allows an image to be queried by means of a written question. Experiments on a variety of medical and natural image datasets show that by fusing image and question features in a novel way, the proposed approach achieves an equal or higher accuracy compared to current methods.
In recent years, deep learning has dramatically improved performance in a variety of medical image analysis applications. Among different types of deep learning models, convolutional neural networks have been among the most successful and they have been used in many applications in medical imaging. Training deep convolutional neural networks often requires large amounts of image data to generalize well to new unseen images. It is often time-consuming and expensive to collect large amounts of data in the medical image domain due to expensive imaging systems and the need for experts to manually make ground truth annotations. A potential problem arises if new structures are added when a decision support system is already deployed and in use. Since the field of radiation therapy is constantly developing, the new structures would also have to be covered by the decision support system. In the present work, we propose a novel loss function to solve multiple problems: imbalanced datasets, partially labeled data, and incremental learning. The proposed loss function adapts to the available data in order to utilize all available data, even when some have missing annotations. We demonstrate that the proposed loss function also works well in an incremental learning setting, where an existing model is easily adapted to semi-automatically incorporate delineations of new organs when they appear. Experiments on a large in-house dataset show that the proposed method performs on par with baseline models, while greatly reducing the training time and eliminating the hassle of maintaining multiple models in practice.
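The idea of a loss that uses every image even when some annotations are missing can be illustrated with a small sketch. This is not the loss function proposed in the abstract, whose formulation is not given here; it is a plain per-class soft Dice loss that skips unannotated classes, written under that assumption. The names `soft_dice` and `partial_label_dice_loss` are hypothetical.

```python
from typing import List, Optional, Sequence


def soft_dice(pred: Sequence[float], target: Sequence[float],
              eps: float = 1e-6) -> float:
    """Soft Dice score between flattened probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (denominator + eps)


def partial_label_dice_loss(
    preds: Sequence[Sequence[float]],
    targets: Sequence[Optional[Sequence[float]]],
) -> float:
    """Average (1 - Dice) over the classes that are annotated.

    `targets[c]` is None when class c has no annotation for this image;
    such classes contribute nothing to the loss (illustrative assumption).
    """
    losses: List[float] = [
        1.0 - soft_dice(p, t)
        for p, t in zip(preds, targets)
        if t is not None
    ]
    if not losses:
        return 0.0  # nothing annotated, so nothing to learn from
    return sum(losses) / len(losses)


# Two classes, four pixels; only the first class is annotated.
preds = [[1.0, 0.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
targets = [[1.0, 0.0, 1.0, 0.0], None]
loss = partial_label_dice_loss(preds, targets)
print(loss)
```

Because unannotated classes are simply left out of the average, an image with delineations for only some organs still contributes to training, and a class added later can be handled by the same function once annotations for it appear.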