Unmanned aerial vehicle (UAV) image-processing applications require different parallel image-enhancement algorithms to be implemented on several high-performance computing platforms using various programming models. To speed up parallelization and improve its efficiency, this work applies the automatic parallelization software package Par4All. We find that the parallel code originally generated by Par4All performs inefficiently. To resolve this problem, we propose several optimization approaches for Par4All on Intel®'s Xeon Phi high-performance computing platform; these approaches exploit the structural features of the image-enhancement algorithms and further optimize the original parallel algorithm. They mainly include (1) optimization of the Par4All automatic parallel search module, (2) dynamic thread-setting optimization, and (3) collaborative parallelization across both CPU and many integrated core (MIC) processors. Comparison experiments involving different algorithms show that the proposed optimization approaches can significantly improve the performance of automatically parallelized algorithms of this kind. The acceleration ratio increases by approximately 30%, 70%, and 80% for the multiscale Retinex, Gaussian-filtering, and median-filtering algorithms, respectively. As a continuation and deepening of our previous research, this work has the potential to benefit other researchers working on image-processing applications that use image-enhancement algorithms.
KEYWORDS
automatic parallelism, image-enhancement algorithms, Par4All, Intel® Xeon Phi, unmanned aerial vehicles
INTRODUCTION
As noted in Reference 1, image-enhancement algorithms such as median filtering, Gaussian filtering, the wavelet transform, and the multiscale Retinex (MSR) algorithm are used when preprocessing large volumes of unmanned aerial vehicle (UAV) remote-sensing (RS) images to address poor clarity, insufficient contrast, and low adaptability. In addition, many-core computing platforms, e.g., graphics processing units (GPUs) and many integrated cores (MICs), have been used to develop parallel algorithms that shorten processing times [2]; such approaches are expected to become mainstream in the near future. Experimental results have shown that parallelized algorithms can significantly improve computation speed and provide better acceleration efficiency. [3][4][5] Usually, there are different kinds of serial