Deep learning has recently gained attention in the atmospheric and oceanic sciences for its potential to improve the accuracy of numerical simulations or to reduce their computational cost. Super‐resolution is one such technique: it infers high‐resolution fields from low‐resolution data. This paper proposes a new scheme called four‐dimensional super‐resolution data assimilation (4D‐SRDA). In this framework, a physics‐based model computes the time evolution of the system at low resolution, while a trained neural network simultaneously performs data assimilation and spatio‐temporal super‐resolution. Because the low‐resolution simulations are run without ensemble members, the computational cost of obtaining inferences at high spatio‐temporal resolution is reduced. In 4D‐SRDA, physics‐based simulations and neural‐network inferences are performed alternately, which may cause a domain shift, that is, a statistical difference between the training and test data, especially when the network is trained offline. Domain shifts can reduce inference accuracy. To mitigate this risk, we developed super‐resolution mixup (SR‐mixup), a data augmentation method for domain generalization. SR‐mixup creates linear combinations of randomly sampled inputs, yielding synthetic data whose distribution differs from that of the original data. The proposed methods were validated with supervised learning on an idealized barotropic ocean jet. The results suggest that combining 4D‐SRDA with SR‐mixup is effective for robust inference cycles. This study highlights the potential of super‐resolution and domain‐generalization techniques in data assimilation, especially for integrating physics‐based and data‐driven models.
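
As an illustration of the linear-combination idea behind SR‐mixup, the following is a minimal sketch and not the implementation used in this study. It assumes that inputs are mixed pairwise within a batch and that the mixing weight is drawn from a Beta distribution, as in standard mixup; the function name `mixup_inputs`, the parameter `alpha`, and its default value are hypothetical, and the abstract does not specify how the weights are sampled or whether targets are mixed as well.

```python
import numpy as np


def mixup_inputs(x, alpha=0.4, rng=None):
    """Return convex combinations of randomly paired samples in a batch.

    x     : array of shape (batch, ...), e.g. low-resolution input fields.
    alpha : Beta-distribution parameter (assumed, borrowed from standard mixup).
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = x.shape[0]
    # One mixing weight per sample, drawn from Beta(alpha, alpha), so 0 <= lam <= 1.
    lam = rng.beta(alpha, alpha, size=batch)
    # Reshape so the weight broadcasts over all non-batch dimensions.
    lam = lam.reshape((batch,) + (1,) * (x.ndim - 1))
    # Pair each sample with a randomly chosen partner from the same batch.
    partner = rng.permutation(batch)
    mixed = lam * x + (1.0 - lam) * x[partner]
    return mixed, lam, partner


# Usage on random stand-in data (8 samples of a 16 x 16 field).
rng = np.random.default_rng(0)
fields = rng.standard_normal((8, 16, 16))
mixed, lam, partner = mixup_inputs(fields, rng=rng)
```

Each synthetic sample lies on the line segment between two original samples, so the augmented set follows a distribution that differs from the original one, which is the property the abstract attributes to SR‐mixup.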