Purpose: This study addresses the low resolution and low signal‐to‐noise ratio (SNR) of diffusion‐weighted images (DWI), which are pivotal for cancer detection. Traditional methods increase SNR at high b‐values through multiple acquisitions, but motion‐induced variations between those acquisitions diminish image resolution. Our research aims to enhance spatial resolution by exploiting the global structure within multicontrast DWI scans and the millimetric motion between acquisitions.

Methods: We introduce a novel approach that employs a “Perturbation Network” to learn subvoxel‐size motions between scans, trained jointly with an implicit neural representation (INR) network. The INR encodes the DWI as a continuous volumetric function and treats the voxel intensities of low‐resolution acquisitions as discrete samples of that function. By evaluating the function on a finer grid, our model predicts higher‐resolution signal intensities at intermediate voxel locations. The Perturbation Network's motion‐correction efficacy was validated through experiments on biological phantoms and in vivo prostate scans.

Results: Quantitative analyses showed that super‐resolution images had significantly higher structural similarity to ground‐truth high‐resolution images than images produced by high‐order interpolation (p < 0.005). In blind qualitative experiments, of super‐resolution images were assessed to have superior diagnostic quality compared to interpolated images.

Conclusion: High‐resolution details in DWI can be obtained without high‐resolution training data. Because the proposed method does not require a super‐resolution training set, it can easily be adapted to images acquired with different scanner settings or of different body parts, which supervised methods do not allow. This flexibility is important in clinical practice.
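
The sampling idea described in Methods can be illustrated with a minimal one‐dimensional sketch. It is not the authors' implementation: `true_signal`, the grid sizes, and the quarter‐voxel shift are hypothetical choices made for illustration. No network is trained here, and the known function is queried directly where a trained INR would be.

```python
import math

def voxel_centers(n, fov=1.0, shift=0.0):
    """Centre coordinates of n equal voxels covering [0, fov], displaced by shift."""
    size = fov / n
    return [(i + 0.5) * size + shift for i in range(n)]

def true_signal(x):
    """Hypothetical continuous 1D signal; a trained INR would approximate it."""
    return math.exp(-((x - 0.5) ** 2) / 0.02)

n_low = 8
voxel = 1.0 / n_low

# Two low-resolution acquisitions of the same object. The second is displaced
# by an assumed quarter-voxel motion, the kind of subvoxel offset the
# Perturbation Network is described as learning.
coords_a = voxel_centers(n_low)
coords_b = voxel_centers(n_low, shift=0.25 * voxel)

# Each acquisition contributes discrete (coordinate, intensity) samples of the
# same continuous function; these pairs would form the INR training set.
samples = [(x, true_signal(x)) for x in coords_a + coords_b]

# Super-resolution step: evaluate the continuous function on a grid twice as
# fine, which yields intensities at intermediate voxel locations.
coords_fine = voxel_centers(2 * n_low)
fine_intensities = [true_signal(x) for x in coords_fine]
```

Because the second acquisition is shifted by less than one voxel, its samples fall between those of the first, so the two scans together constrain the continuous function more densely than either scan alone.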