Self-supervised learning allows neural networks to be trained without immense, high-quality, labelled data sets. We demonstrate that self-supervision additionally improves the robustness of models trained on small, imbalanced or incomplete data sets, which pose severe difficulties for supervised models. For small data sets, the accuracy of our approach is up to 12.5% higher on MNIST and 15.2% higher on Fashion-MNIST compared to random initialization. Moreover, self-supervision influences the learning process itself: with small or strongly imbalanced data sets, it prevents individual classes from being learned insufficiently or not at all. Even when input data are corrupted and large image regions are missing from the training set, self-supervision significantly improves classification accuracy (by up to 7.3% for MNIST and 2.2% for Fashion-MNIST). In addition, we analyse combinations of data manipulations and work towards a better understanding of how pretext accuracy and downstream accuracy are related. This is important not only for ensuring optimal pretraining but also for finding an appropriate evaluation measure when training with unlabelled data. As such, we make an important contribution to learning with realistic data sets and to making machine learning accessible to application areas in which data collection is expensive and difficult.
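To make the setting concrete, the following is a minimal sketch of self-supervised pretraining (pretext task) followed by downstream fine-tuning on a small labelled MNIST subset. The choice of rotation prediction as the pretext task, the network architecture and all hyperparameters are illustrative assumptions, not the exact configuration used in our experiments.

```python
# Minimal sketch: rotation-prediction pretext pretraining, then downstream
# fine-tuning on a small labelled subset. Architecture, pretext task and
# hyperparameters are illustrative assumptions only.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

def make_encoder():
    # Small convolutional encoder shared by the pretext and downstream heads.
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(32 * 7 * 7, 128), nn.ReLU(),
    )

def rotate_batch(x):
    # Pretext task: predict which of four rotations was applied to each image.
    k = torch.randint(0, 4, (x.size(0),))
    rotated = torch.stack(
        [torch.rot90(img, int(r), dims=(1, 2)) for img, r in zip(x, k)]
    )
    return rotated, k

def train(encoder, head, loader, epochs, transform_fn=None):
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            if transform_fn is not None:
                x, y = transform_fn(x)  # pretext labels replace the class labels
            opt.zero_grad()
            loss = loss_fn(head(encoder(x)), y)
            loss.backward()
            opt.step()

mnist = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
unlabelled = DataLoader(mnist, batch_size=256, shuffle=True)   # class labels are ignored in the pretext phase
small_labelled = DataLoader(Subset(mnist, range(500)), batch_size=64, shuffle=True)

encoder = make_encoder()
train(encoder, nn.Linear(128, 4), unlabelled, epochs=1, transform_fn=rotate_batch)  # pretext pretraining
train(encoder, nn.Linear(128, 10), small_labelled, epochs=5)                        # downstream fine-tuning
```

The same encoder weights are reused in both phases; a randomly initialized baseline would skip the first call to train, which is the comparison underlying the accuracy gains reported above.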