Modern transportation systems contain a large number of cameras that continuously capture vehicle images. Automatic analysis of these images therefore supports traffic flow management, criminal investigations and vehicle inspections. Vehicle matching, which aims to determine whether two input images depict the same vehicle, is one of the core tasks in vehicle analysis. Recent studies have focused on local rather than global feature extraction, since local details can provide crucial cues for distinguishing between cars. However, these methods do not select among local features; that is, they do not assign weights to them. In this research, we systematically study the vehicle matching task and present a novel annotation‐free, local‐based deep learning method called Adaptive super‐pixel discriminative feature‐selective learning (ASDFL) to address this issue. In ASDFL, vehicle images are segmented into super‐pixels of similar size by considering the location and colour similarities of pixels, without using any component‐level annotation. These super‐pixels are treated as the virtual components of vehicles. A convolutional neural network is then used to extract the deep features of these virtual components. Thereafter, an instance‐specific mask generation module driven by the extracted global features produces a mask that selects the most distinctive virtual components of each vehicle image pair in the feature space. Finally, the vehicle matching task is accomplished by classifying the selected virtual component features of each vehicle image pair. Extensive experiments on two popular vehicle identification benchmarks show that our method is 1.57% and 0.8% more accurate than the previous baselines on the vehicle matching task on the VeRi and VehicleID datasets, respectively.
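To make the segmentation step concrete, the following is a minimal sketch of grouping pixels by joint location and colour similarity, assuming a simple k-means-style procedure with grid-initialised centres. The abstract does not name the clustering algorithm, so the function `superpixel_labels` and its parameters `grid`, `spatial_weight` and `iterations` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def superpixel_labels(image, grid=2, spatial_weight=1.0, iterations=5):
    """Label each pixel with a cluster index using location + colour distance.

    image: list of rows, each row a list of (r, g, b) tuples.
    grid: seeds per axis, giving grid * grid clusters (illustrative assumption).
    spatial_weight: trade-off between spatial and colour distance (assumption).
    """
    height, width = len(image), len(image[0])

    # Seed the centres on a regular grid so clusters start at similar sizes.
    centres = []
    for gy in range(grid):
        for gx in range(grid):
            y = int((gy + 0.5) * height / grid)
            x = int((gx + 0.5) * width / grid)
            centres.append([float(y), float(x), *map(float, image[y][x])])

    labels = [[0] * width for _ in range(height)]
    for _ in range(iterations):
        # Assignment step: nearest centre in joint (row, col, r, g, b) space.
        for y in range(height):
            for x in range(width):
                r, g, b = image[y][x]
                best, best_d = 0, math.inf
                for k, (cy, cx, cr, cg, cb) in enumerate(centres):
                    d_space = (y - cy) ** 2 + (x - cx) ** 2
                    d_colour = (r - cr) ** 2 + (g - cg) ** 2 + (b - cb) ** 2
                    d = d_colour + spatial_weight * d_space
                    if d < best_d:
                        best, best_d = k, d
                labels[y][x] = best

        # Update step: move each centre to the mean of its assigned pixels.
        sums = [[0.0] * 5 for _ in centres]
        counts = [0] * len(centres)
        for y in range(height):
            for x in range(width):
                k = labels[y][x]
                counts[k] += 1
                for i, v in enumerate((y, x, *image[y][x])):
                    sums[k][i] += v
        for k, n in enumerate(counts):
            if n:  # leave empty clusters where they are
                centres[k] = [s / n for s in sums[k]]
    return labels
```

Each resulting label region would play the role of one virtual component. In practice a dedicated super-pixel routine would replace this loop, and the per-component deep features and the selection mask would be computed downstream by the network.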