In many domestic and military applications, aerial vehicle detection and super-resolution algorithms are frequently developed and applied independently. However, detecting vehicles in super-resolved aerial images remains challenging because the super-resolved images lack discriminative information. To address this problem, we propose a Joint Super-Resolution and Vehicle Detection Network (Joint-SRVDNet) that generates discriminative, high-resolution images of vehicles from low-resolution aerial images. First, aerial images are up-scaled by a factor of 4x using a Multi-scale Generative Adversarial Network (MsGAN), which produces multiple intermediate outputs of increasing resolution. Second, a detector is trained on the images that MsGAN has super-resolved by a factor of 4x. Finally, the detection loss is minimized jointly with the super-resolution loss to encourage the target detector to be sensitive to the subsequent super-resolution training. The network jointly learns hierarchical and discriminative features of targets and produces optimal super-resolution results. We evaluate the proposed network both quantitatively and qualitatively on the VEDAI, xView, and DOTA datasets. The experimental results show that the proposed framework achieves better visual quality than state-of-the-art methods for aerial super-resolution at a 4x up-scaling factor and improves the accuracy of aerial vehicle detection.
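
The joint objective described above can be illustrated with a minimal sketch. It assumes the super-resolution loss is a sum of one term per intermediate MsGAN output and that the detection loss enters through a scalar weight. The function name `joint_loss`, the parameter `det_weight`, and the numeric values are hypothetical and are not taken from the paper, which may combine the terms differently.

```python
from typing import Sequence


def joint_loss(sr_losses: Sequence[float], det_loss: float,
               det_weight: float = 1.0) -> float:
    """Combine per-scale super-resolution losses with a detection loss.

    sr_losses:  one loss value per intermediate output (assumed form).
    det_loss:   loss of the detector on the final super-resolved image.
    det_weight: hypothetical balancing weight, not given in the source.
    """
    if not sr_losses:
        raise ValueError("expected at least one super-resolution loss")
    # Sum the per-scale terms, then add the weighted detection term.
    return sum(sr_losses) + det_weight * det_loss


# Illustrative values only: two intermediate outputs and one detection loss.
total = joint_loss(sr_losses=[0.5, 0.25], det_loss=1.0, det_weight=0.5)
```

In an actual training loop the inputs would be differentiable tensors rather than floats, so that gradients from the detection term also reach the super-resolution generator.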