This paper proposes an optimised SIFT (Scale Invariant Feature Transform) feature detection architecture for an FPGA implementation of an image matcher. For a SIFT-based image matcher to be implemented efficiently on an FPGA, in terms of both speed and hardware resource usage, the original SIFT algorithm has been significantly optimised in three respects: 1) Upsampling has been replaced with downsampling, which avoids the interpolation operation. 2) Only four scales with two octaves are used by our image matcher, at the cost of a moderate degradation in matching performance. 3) The total dimension of the feature descriptor has been reduced from the 128 of the original SIFT to 72, which significantly simplifies the image matching operation. With these optimisations, the proposed FPGA implementation detects the features of a typical 640x480-pixel image within 31 milliseconds. Compared with the existing SIFT FPGA implementation, which requires 33 milliseconds for a 320x240-pixel image, the proposed architecture therefore achieves a significant improvement.
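To make the size of the reported improvement concrete, the figures quoted above can be turned into a rough per-pixel comparison. The sketch below is only illustrative arithmetic: it assumes that pixels processed per second is a fair basis for comparing the two implementations, and that matching cost grows linearly with descriptor dimension (as it would for a plain element-wise distance). Neither assumption is stated in the paper, and the function name `pixel_throughput` is hypothetical.

```python
def pixel_throughput(width, height, milliseconds):
    """Return pixels processed per second for one frame (hypothetical helper)."""
    return width * height / (milliseconds / 1000.0)


# Figures quoted in the abstract.
proposed = pixel_throughput(640, 480, 31)   # proposed architecture
existing = pixel_throughput(320, 240, 33)   # existing SIFT FPGA implementation

# Assumption: per-pixel throughput is a fair comparison metric.
speedup = proposed / existing

# Assumption: per-comparison matching cost scales linearly with the
# descriptor dimension, so 72 elements cost 72/128 of the original work.
descriptor_ratio = 72 / 128

print(f"proposed: {proposed / 1e6:.2f} Mpixel/s")
print(f"existing: {existing / 1e6:.2f} Mpixel/s")
print(f"throughput ratio: {speedup:.2f}x")
print(f"descriptor cost ratio: {descriptor_ratio:.4f}")
```

Under these assumptions the proposed design processes four times as many pixels in slightly less time, and each descriptor comparison touches a little over half as many elements as with the original 128-dimensional descriptor.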