Image recognition is a useful technique with applications in many areas. Two-Dimensional Continuous Dynamic Programming (2DCDP) is a pixel-level matching algorithm for object recognition. Compared with other methods, 2DCDP offers sufficiently high recognition accuracy without training. In our previous work, we used 2DCDP to implement image classification; however, we found its processing speed to be very slow. In this paper, we first analyze the performance of the 2DCDP algorithm and show that its large memory consumption is the bottleneck. We then improve 2DCDP and propose a new object recognition algorithm, the Pixel-based Multi-Anchor (PMA) algorithm, which locates anchor points that are then used to locate the recognized area. Theoretical analysis shows that the new algorithm reduces the memory capacity requirement from O(n^4) to O(n^3), where n is the size of the image. Furthermore, based on an understanding of multi-core architecture, we propose a fine-grained parallelism thread model to parallelize the PMA algorithm on multi-core systems. Taking the cache coherence problem into account, we further propose a coarse-grained parallelism thread model to optimize PMA performance. Experimental results show that, compared with the original 2DCDP algorithm, the PMA algorithm dramatically decreases the memory capacity requirement, which in turn improves recognition speed. More importantly, the PMA algorithm can efficiently process large images that exceed the capability of the original 2DCDP algorithm.
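To give a rough sense of what the reduction from O(n^4) to O(n^3) means in practice, the following sketch compares the two growth rates numerically. It is an illustration only and does not reproduce the data structures of 2DCDP or PMA: it assumes that n is the side length of a square image, that each table entry takes 4 bytes, and that constant factors are 1. The names `table_bytes` and `to_gib` are hypothetical helpers introduced here.

```python
def table_bytes(n: int, exponent: int, bytes_per_entry: int = 4) -> int:
    """Bytes for a table with n**exponent entries.

    Assumptions (not from the paper): constant factor of 1 and
    4 bytes per entry. Only the growth rate is meaningful.
    """
    return (n ** exponent) * bytes_per_entry


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical image sizes, chosen only to show the trend.
    for n in (64, 128, 256, 512):
        quartic = table_bytes(n, 4)  # O(n^4) growth
        cubic = table_bytes(n, 3)    # O(n^3) growth
        print(
            f"n={n:4d}  n^4: {to_gib(quartic):10.3f} GiB  "
            f"n^3: {to_gib(cubic):8.4f} GiB  ratio: {quartic // cubic}"
        )
```

Under these assumptions the ratio between the two footprints is exactly n, so the saving grows with the image size, which is consistent with the claim that larger images become tractable.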