Re-ranking algorithms have been proposed to improve the effectiveness of content-based image retrieval systems by exploiting contextual information encoded in distance measures and ranked lists. In this paper, we show how we improved the efficiency of one of these algorithms, Contextual Spaces Re-Ranking (CSRR). Our first approach parallelizes the algorithm with OpenCL so that it uses both the central and graphics processing units of an accelerated processing unit. Our second approach modifies the algorithm; compared with the original CSRR, the modified version not only reduces the total running time of our implementations by a median factor of 1.6 but also increases the accuracy score in most of our test cases. Combining parallelization and algorithm modification yields a median speedup of 5.4× from the original serial CSRR to the parallelized modified version. We also explored different implementations of CSRR's Re-sort Ranked Lists step, which provided insights into graphics processing unit sorting, the performance impact of image descriptors, and the trade-offs between effectiveness and efficiency.

[...] vectors, which are later used for similarity assessment among images. A CBIR system ranks the collection images in decreasing order of similarity, and because users consider mostly the top-ranked images, it is imperative that the ranking be as accurate as possible.

In recent years, several successful attempts have been made to increase the effectiveness (quality of results) of CBIR systems [1][2][3][4][5][6]. In particular, re-ranking methods have been used to improve the effectiveness of CBIR systems by exploiting contextual information encoded in similarity scores and ranked lists. These methods are, however, very costly, as they are based on comparing collection images multiple times.
In a real-world scenario, CBIR systems require both good effectiveness and good efficiency (response time), so the efficiency of re-ranking methods must be improved.

Central processing units (CPUs) no longer have just one core, and graphics processing units (GPUs) are now used as general-purpose processors, having evolved into massively parallel architectures capable of executing hundreds of operations per cycle [7]. These devices have been successfully used to accelerate re-ranking [7, 8] and retrieval [9] systems, obtaining good speedups.

Therefore, parallelization seems a good fit for improving the performance of the Contextual Spaces Re-Ranking (CSRR) algorithm [2], which we discuss in this paper. Another possible approach consists in analyzing the trade-offs between accuracy and performance that arise from modifying existing algorithms, because this can lead to eliminating demanding work.

Our solution first exploits parallelization to speed up the more costly steps of the CSRR algorithm, obtaining speedups of up to 3.3× for the Compute Distances step and 5.1× for the Re-sort Ranked Lists step on an accelerated processing unit (APU). We then propose a modification to this algorithm, which by itse...