Semi-supervised deep learning (SSDL) is a popular strategy for leveraging unlabelled data in machine learning when labelled data is not readily available. In real-world scenarios, several unlabelled data sources are usually available, each with a different degree of distribution mismatch with respect to the labelled dataset. This raises the question of which unlabelled dataset to choose for good SSDL outcomes. Often, semantic heuristics are used to match unlabelled data with labelled data, but a quantitative and systematic approach to this selection problem would be preferable. In this work, we first test the SSDL MixMatch algorithm under various distribution mismatch configurations to study the impact on SSDL accuracy. We then propose a quantitative unlabelled dataset selection heuristic based on dataset dissimilarity measures, which we refer to as deep dataset dissimilarity measures (DeDiMs). These measures compare the labelled and unlabelled datasets in order to systematically assess how distribution mismatch between them affects MixMatch performance. They use the feature space of a generic Wide-ResNet, can be applied prior to learning, are quick to evaluate, and are model agnostic. The strong correlation in our tests between MixMatch accuracy and the proposed DeDiMs suggests that this approach can be a good fit for quantitatively ranking different unlabelled datasets prior to SSDL training.

Impact Statement: Semi-supervised deep learning is a technique for training a deep learning model when few labelled observations are available, by leveraging unlabelled datasets. Different unlabelled data sources may be available, introducing the possibility of distribution mismatches between the labelled and unlabelled datasets. In this work, we assess the impact of distribution mismatches on the outcomes of the semi-supervised MixMatch algorithm.
We propose a set of simple feature-space density dataset distances, referred to as deep dataset dissimilarity measures (DeDiMs). In our extensive test-bed, the evaluated DeDiMs yield