This paper addresses the problem of latency reduction in vision-based mobile robot navigation. Low latency in determining a control command from visual data is considered a crucial system property in practical deployments of mobile robots. The problem is addressed by a processor-centric FPGA-based System-on-Chip design that enables power-efficient and computationally efficient on-line image processing. The proposed architecture is applied to autonomous vision-based navigation with a teach-and-repeat algorithm based on the detection and tracking of salient image points. The architecture has been evaluated and compared with CPU-based solutions on different platforms, and the results indicate that the proposed FPGA-based implementation outperforms pure CPU solutions in overall latency, speed, and power consumption.