In this paper, we examined heterogeneous architectures for their suitability to run the scale-invariant feature transform (SIFT) algorithm in real time. SIFT is one of the most robust, and also one of the most computationally intensive, algorithms for extracting local features in many machine-vision applications. Many studies have presented methods for improving the SIFT execution time. However, the described techniques focus only on improving the SIFT execution time on a single homogeneous device. To address this gap in improving SIFT execution time on multi-device heterogeneous platforms, we prepared the OpenCL-SIFT implementation. We described techniques to efficiently parallelize an application that contains many different computing cores. Through a careful optimization process, we produced a performance-portable implementation for efficient processing on various multi-device heterogeneous platforms. The experimental results showed that our implementation obtains appropriate accuracy and higher efficiency compared to recent open-source SIFT implementations. Using the proposed methods, we extracted SIFT features at more than 30 FPS on Full-HD images on different processor architectures. Additionally, to increase performance, we showed efficient multi-device scheduling methods for SIFT feature extraction (an average speed-up of 2.69×). Finally, we described guidelines for optimizing GPGPU-OpenCL programs for x86 multi-core CPUs. The discussed methods are generic and may be used for the design of other algorithms.