Capturing RFID tags within a region of interest (ROI) is challenging. Issues such as multipath interference, frequency-dependent hardware characteristics, and phase periodicity make it difficult for RF phase to accurately indicate the tag-to-antenna distance needed for RFID tag localization. In this paper, we propose a comprehensive solution, called RF-Focus, which fuses RFID and computer vision (CV) techniques to recognize and locate moving RFID-tagged objects within the ROI. First, we build a multipath propagation model and propose a dual-antenna solution to minimize the impact of multipath interference on RF phase. Second, by extending the multipath model, we estimate the phase shifts caused by hardware characteristics at different operating frequencies. Third, to minimize the tag position uncertainty caused by RF phase periodicity, we leverage CV to extract image regions that are likely to contain RFID-tagged objects in the ROI, and then associate these regions with the processed RF phase, from which the phase shifts due to multipath interference and hardware characteristics have been removed, for recognition and localization. Our experiments demonstrate the effectiveness of the multipath modelling and the hardware-related phase shift estimation. When five RFID-tagged objects are moving in the ROI, RF-Focus achieves an average recognition accuracy of 91.67% and a localization accuracy of 94.26% at a false positive rate of 10%.
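
To illustrate why phase periodicity leaves the tag position uncertain, the following minimal sketch assumes the commonly used backscatter phase model, theta = (4*pi*d/lambda + theta_hw) mod 2*pi, which is not stated in this abstract. The function names, the 920 MHz carrier, and the 3 m range are hypothetical choices for illustration only and are not taken from RF-Focus. Multipath is ignored here.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def expected_phase(distance_m, freq_hz, hw_offset_rad=0.0):
    """Wrapped phase for a round trip of 2*d (assumed model, no multipath)."""
    wavelength = C / freq_hz
    raw = 4.0 * math.pi * distance_m / wavelength + hw_offset_rad
    return raw % (2.0 * math.pi)


def candidate_distances(phase_rad, freq_hz, max_distance_m, hw_offset_rad=0.0):
    """All distances up to max_distance_m that produce the same wrapped phase."""
    half_wavelength = (C / freq_hz) / 2.0
    # Remove the (assumed known) hardware offset, then map phase to [0, lambda/2).
    corrected = (phase_rad - hw_offset_rad) % (2.0 * math.pi)
    base = corrected / (2.0 * math.pi) * half_wavelength
    candidates = []
    k = 0
    while base + k * half_wavelength <= max_distance_m:
        candidates.append(base + k * half_wavelength)
        k += 1
    return candidates


if __name__ == "__main__":
    freq = 920e6  # illustrative UHF carrier, an assumption
    phase = expected_phase(1.0, freq)
    for d in candidate_distances(phase, freq, max_distance_m=3.0):
        print(f"{d:.4f} m")
```

Under this model, a single phase reading matches many distances spaced half a wavelength apart, so the phase alone cannot pick out the true one. An independent cue, such as the CV-derived image regions described above, is needed to select among the candidates.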