Segmentation of anatomical structures, from modalities like computed tomography (CT), magnetic resonance imaging (MRI) and ultrasound, is a key enabling technology for medical applications such as diagnostics, planning and guidance. More efficient implementations are necessary, as most segmentation methods are computationally expensive and the amount of medical imaging data is growing. The increased programmability of graphics processing units (GPUs) in recent years has enabled their use in several areas. GPUs can solve large data-parallel problems at a higher speed than the traditional CPU, while being more affordable and energy efficient than distributed systems. Furthermore, using a GPU enables concurrent visualization and interactive segmentation, where the user can help the algorithm achieve a satisfactory result. This review investigates the use of GPUs to accelerate medical image segmentation methods. A set of criteria for efficient use of GPUs is defined, and each segmentation method is rated accordingly. In addition, references to relevant GPU implementations and insight into GPU optimization are provided and discussed. The review concludes that most segmentation methods may benefit from GPU processing due to the methods' data-parallel structure and high thread count. However, factors such as synchronization, branch divergence and memory usage can limit the speedup.
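To make the "data-parallel structure and high thread count" concrete, the sketch below shows a simple per-voxel threshold segmentation launched with one GPU work-item per voxel. This is a hypothetical illustration, not code from any of the reviewed papers; it assumes the PyOpenCL bindings and an available OpenCL runtime, and the kernel and buffer names are invented.

```python
# Minimal sketch: one GPU work-item per voxel performs a threshold
# segmentation. Hypothetical example, assuming pyopencl is installed
# and an OpenCL device is available.
import numpy as np
import pyopencl as cl

kernel_src = """
__kernel void threshold(__global const float *img,
                        __global uchar *mask,
                        const float t) {
    int i = get_global_id(0);       // one thread per voxel
    mask[i] = img[i] > t ? 1 : 0;   // every voxel is independent
}
"""

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

img = np.random.rand(256 * 256 * 128).astype(np.float32)  # stand-in volume
mask = np.empty(img.shape, dtype=np.uint8)

mf = cl.mem_flags
img_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=img)
mask_buf = cl.Buffer(ctx, mf.WRITE_ONLY, mask.nbytes)

prg = cl.Program(ctx, kernel_src).build()
prg.threshold(queue, img.shape, None, img_buf, mask_buf, np.float32(0.5))
cl.enqueue_copy(queue, mask, mask_buf)
```

Because each voxel is processed independently, the kernel spawns millions of threads with no synchronization and a uniform branch pattern, which is exactly the profile the review's criteria reward; methods with more inter-voxel dependencies score lower on those criteria.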
Heterogeneous computing, which combines devices with different architectures, is rising in popularity, and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programming such systems, and offers functional portability. It does, however, suffer from poor performance portability: code tuned for one device must be re-tuned to achieve good performance on another device. In this paper, we use machine learning-based auto-tuning to address this problem. Benchmarks are run on a random subset of the entire tuning parameter configuration space, and the results are used to build an artificial neural network-based model. The model can then be used to find interesting parts of the parameter space for further search. We evaluate our method with different benchmarks, on several devices, including an Intel i7 3770 CPU, an Nvidia K40 GPU and an AMD Radeon HD 7970 GPU. Our model achieves a mean relative error as low as 6.1%, and is able to find configurations as little as 1.3% worse than the global minimum.
Heterogeneous computing, combining devices with different architectures such as CPUs and GPUs, is rising in popularity and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programming such systems and offers functional portability. However, it suffers from poor performance portability, because applications must be retuned for every new device. In this paper, we use machine learning-based auto-tuning to address this problem. Benchmarks are run on a random subset of the tuning parameter spaces, and the results are used to build a machine learning-based performance model. The model can then be used to find interesting subspaces for further search. We evaluate our method using five image processing benchmarks, with tuning parameter spaces of up to 2.3 million configurations, using different input sizes, on several devices, including an Intel i7 4771 (Haswell) CPU, an Nvidia Tesla K40 GPU, and an AMD Radeon HD 7970 GPU. We compare different machine learning algorithms for the performance model. Our model achieves a mean relative error as low as 3.8%, and in some cases finds configurations on average only 0.29% slower than the best configuration, while evaluating less than 1.1% of the search space. The source code of our framework is available at https://github.com/acelster/ML-autotuning.
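The sketch below illustrates the workflow these two abstracts describe: benchmark a random subset of the tuning space, fit a neural-network performance model, and use the model to select promising configurations for actual evaluation. It is a hedged stand-in, not the authors' framework (see the GitHub link above): the tuning parameters and the `run_benchmark` function are invented toy examples, and scikit-learn's `MLPRegressor` stands in for the paper's model.

```python
# Minimal sketch of ML-based auto-tuning: sample a random subset of
# the tuning space, fit a neural-network performance model, then use
# the model to pick promising configurations for real evaluation.
# Hypothetical stand-in code, not the authors' framework.
import itertools
import random
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy tuning space: work-group sizes and unroll factor (assumed parameters).
space = list(itertools.product([8, 16, 32, 64, 128],   # work-group x
                               [1, 2, 4, 8],           # work-group y
                               [1, 2, 4]))             # unroll factor

def run_benchmark(cfg):
    """Stand-in for timing a real OpenCL kernel with configuration cfg."""
    wx, wy, unroll = cfg
    return abs(wx * wy - 256) / 256 + 0.1 / unroll + random.gauss(0, 0.01)

# 1. Benchmark a random subset of the configuration space.
train = random.sample(space, int(0.2 * len(space)))
X = np.array(train, dtype=float)
y = np.array([run_benchmark(c) for c in train])

# 2. Fit a neural-network performance model to the measurements.
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000)
model.fit(X, y)

# 3. Rank the whole space with the cheap model, then actually time
#    only the most promising candidates.
pred = model.predict(np.array(space, dtype=float))
candidates = [space[i] for i in np.argsort(pred)[:5]]
best = min(candidates, key=run_benchmark)
print("best configuration found:", best)
```

The point of the model is that step 3 replaces an exhaustive sweep: only the handful of configurations the model ranks highest are ever timed on the device, which is how a method like this can evaluate around 1% of the space and still land close to the global minimum.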
As data sets continue to grow in size, visualization has become a vitally important tool for extracting meaningful knowledge. Scattered point data, which are unordered sets of point coordinates with associated measured values, arise in many contexts, such as scientific experiments, sensor networks, and numerical simulations. In this paper, we present a method for visualizing such scattered point data sets. Our method is based on volume ray casting, and distinguishes itself by operating directly on the unstructured samples, rather than resampling them to form voxels. We estimate the intensity of the volume at points along the rays by interpolation using nearby samples, taking advantage of an octree to facilitate efficient range search. The method has been implemented on multi-core CPUs, GPUs, and multi-GPU systems. To test our method, actual X-ray diffraction data sets have been used, consisting of up to 240 million data points. We are able to generate images of good quality and achieve interactive frame rates in favorable cases. The GPU implementation (Nvidia Tesla K20) achieves speedups of 8-14x compared with our parallelized CPU version (4-core, hyper-threaded Intel i7 3770K).
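The core per-sample step of this approach can be sketched as follows: at each point along a ray, gather the scattered samples within a small radius and interpolate their values. In this hedged sketch, scipy's `cKDTree` stands in for the paper's octree as the range-search structure, inverse-distance weighting is an assumed choice of interpolant, and the point set, radius, and compositing are invented for illustration.

```python
# Sketch of the per-sample step in scattered-point ray casting:
# estimate intensity at points along a ray from nearby samples.
# A k-d tree stands in for the paper's octree; inverse-distance
# weighting is an assumed interpolant, not necessarily the paper's.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((100_000, 3))   # scattered sample coordinates
values = rng.random(100_000)        # measured values at those points
tree = cKDTree(points)              # accelerates the range search

def intensity(p, radius=0.02):
    """Inverse-distance-weighted estimate at point p from samples
    within `radius`; returns 0 where no samples are found."""
    idx = tree.query_ball_point(p, radius)
    if not idx:
        return 0.0
    d = np.linalg.norm(points[idx] - p, axis=1)
    w = 1.0 / np.maximum(d, 1e-9)
    return float(np.sum(w * values[idx]) / np.sum(w))

# March one ray through the unit cube, accumulating a simple
# emission integral (proper compositing is omitted for brevity).
origin = np.array([0.5, 0.5, 0.0])
direction = np.array([0.0, 0.0, 1.0])
step = 0.01
samples = [intensity(origin + t * direction)
           for t in np.arange(0.0, 1.0, step)]
pixel = sum(samples) * step
print("accumulated intensity:", pixel)
```

Since each ray (and each sample along it) is independent, this loop maps naturally onto one GPU thread per ray, which is what makes the GPU and multi-GPU implementations reported above possible.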