Vehicle re-identification (ReID) aims to identify a target vehicle in large-scale surveillance videos captured by multiple cameras, where robust and distinctive visual features of vehicles are critical to performance. Recently, researchers have approached the problem with attention-based models. However, most of these models use strongly supervised methods, which rely on expensive extra labels such as keypoints (vehicle wheels, logos, and lamps) and attributes (e.g., color and type). Therefore, we propose a joint metric learning approach to solve the problem. We present an end-to-end Partition and Fusion Multi-branch Network (PFMN), a novel approach that effectively learns discriminative features without extra annotations or additional attributes. To handle hard samples, i.e., different vehicles with similar appearances or the same vehicle with different appearances, we propose a novel variant of the hard-sampling triplet loss. Extensive experiments demonstrate the effectiveness of the proposed method. On the challenging public datasets VeRi-776 and VehicleID, our model outperforms most state-of-the-art algorithms in mAP and rank-1 accuracy. The improvement is especially significant on mINP, which measures the cost for the model to retrieve hard samples.
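
To make the mINP metric mentioned above concrete, the following is a minimal sketch that assumes the commonly used definition: for each query, the inverse negative penalty (INP) is the number of correct gallery matches divided by the rank position of the hardest (last-retrieved) correct match, and mINP is the mean over queries. The function names and the toy rankings are hypothetical illustrations, not the evaluation code used in this work.

```python
def inverse_negative_penalty(ranked_is_match):
    """INP for one query.

    ranked_is_match: list of booleans, ordered by descending similarity,
    where True marks a gallery item with the same identity as the query.
    Returns (#correct matches) / (1-based rank of the last correct match),
    or None when the gallery contains no correct match for this query.
    """
    positions = [i + 1 for i, is_match in enumerate(ranked_is_match) if is_match]
    if not positions:
        return None
    return len(positions) / positions[-1]


def mean_inp(all_ranked_is_match):
    """Mean INP over all queries that have at least one correct match."""
    scores = [inverse_negative_penalty(r) for r in all_ranked_is_match]
    scores = [s for s in scores if s is not None]
    if not scores:
        raise ValueError("no query has a correct match in the gallery")
    return sum(scores) / len(scores)


# Hypothetical toy rankings (assumption, for illustration only).
rankings = [
    [True, True, False, False],  # hardest match at rank 2 -> INP = 2/2
    [True, False, False, True],  # hardest match at rank 4 -> INP = 2/4
]
print(mean_inp(rankings))
```

In the second toy query the first correct match is found immediately, so rank-1 is unaffected, yet the INP drops because the hardest match is retrieved late. Under this definition, a higher mINP means less effort is needed to find all correct matches.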