Vessel recognition plays an important role in ensuring navigation safety. However, existing methods rely mainly on a single sensor, such as the automatic identification system (AIS), marine radar, or closed-circuit television (CCTV). To address this limitation, this paper proposes a coarse-to-fine recognition method that fuses CCTV and marine radar, called multi-scale matching vessel recognition (MSM-VR). First, a novel calibration method is introduced that requires no additional calibration target; calibration is formulated as solving an N-point registration model. Second, the marine radar image is used for coarse detection, and a region of interest (ROI) is computed from the coarse detection results. Lastly, we design a novel convolutional neural network (CNN) called VesNet, which recasts recognition as feature extraction and is used to extract vessel features. The MSM-VR method has been validated on real datasets collected along different waterways in China, such as the Nanjing and Wuhan waterways, covering different times and weather conditions. Experimental results show that MSM-VR adapts to different times, weather conditions, and waterways with good detection stability, and that its recognition accuracy is no less than 96%. Compared with other methods, the proposed method achieves high accuracy and strong robustness.
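
The abstract does not specify the form of the N-point registration model. As a rough illustration only, the sketch below assumes that radar detections lie on the water plane and that the radar-to-CCTV mapping can be approximated by a planar homography fitted to N >= 4 matched points by least squares (the direct linear transform). The function names `fit_homography` and `project` are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography H with dst ~ H @ src, from N >= 4 point pairs.

    Assumption for illustration: src are radar-plane points, dst are the
    matching CCTV pixel coordinates. The paper's actual model may differ.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need two (N, 2) arrays with N >= 4")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # fix the arbitrary scale (assumes h[2, 2] != 0)

def project(h, pts):
    """Map (N, 2) points through homography h and return (N, 2) points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Under these assumptions, a coarse radar detection could be passed through `project` to obtain a pixel location around which an ROI is placed in the CCTV frame. In practice, the point coordinates would usually be normalized before the fit for numerical stability.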