Radio-frequency identification (RFID) localization has drawn much attention with the emergence of the Internet of Things (IoT), and applying deep learning to RFID localization offers many advantages. In this paper, we present a deep convolutional neural network (CNN)-based approach for passive RFID tag localization that exploits joint fingerprint features of the received signal strength indication (RSSI) and the phase difference of arrival (PDOA). First, the RSSI and PDOA data are extracted from the signals received by the RFID readers. Then, a CNN with three convolution layers and pooling layers is designed; in the offline stage, the normalized RSSI and PDOA data are formed into images that serve as the network input for training the weights. In the online stage, the RSSI and PDOA data of the test tags are collected, and the positions of the unknown tags are predicted by the trained CNN. In the simulations, the accuracy of the proposed approach is compared with that of several fingerprinting-based schemes, namely LANDMARC, weighted K-nearest neighbor (WKNN), and a deep neural network (DNN), and the impact of different fingerprint data sets and noise variances on the positioning accuracy is analyzed. Experiments show that the proposed approach can locate multiple tags with high accuracy and stability in complex indoor environments, and that it outperforms the other schemes considered.
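As an illustration of how normalized RSSI and PDOA readings might be formed into an image for the CNN input, the following is a minimal sketch only. The abstract does not specify the image layout, the normalization, or the value ranges, so the function names (`min_max_normalize`, `make_fingerprint_image`), the assumed RSSI range of -90 to -30 dBm, the 4 x 4 grid, and the two-channel layout are all hypothetical choices, and the sample readings are randomly generated rather than measured.

```python
import numpy as np


def min_max_normalize(x, lo, hi):
    """Scale x linearly from [lo, hi] to [0, 1], clipping out-of-range values."""
    x = np.asarray(x, dtype=np.float64)
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)


def make_fingerprint_image(rssi_dbm, pdoa_rad, shape, rssi_range=(-90.0, -30.0)):
    """Stack normalized RSSI and PDOA readings into a (2, H, W) array.

    Assumption: each tag yields H*W RSSI values and H*W PDOA values,
    and the RSSI range is a hypothetical choice, not taken from the paper.
    """
    h, w = shape
    rssi = min_max_normalize(rssi_dbm, *rssi_range).reshape(h, w)
    # Wrap phase differences into [0, 2*pi) and scale them to [0, 1).
    pdoa = (np.mod(pdoa_rad, 2 * np.pi) / (2 * np.pi)).reshape(h, w)
    # Channel 0 holds RSSI, channel 1 holds PDOA.
    return np.stack([rssi, pdoa], axis=0)


# Hypothetical example: 16 readings per feature, arranged on a 4 x 4 grid.
rng = np.random.default_rng(0)
rssi = rng.uniform(-80.0, -40.0, size=16)      # synthetic RSSI in dBm
pdoa = rng.uniform(0.0, 2 * np.pi, size=16)    # synthetic PDOA in radians
image = make_fingerprint_image(rssi, pdoa, shape=(4, 4))
```

Keeping RSSI and PDOA in separate channels is one possible design; concatenating them side by side in a single-channel image would be an equally plausible reading of the description.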