Deep neural networks (DNNs) excel at visual recognition tasks and are increasingly used as a modelling framework for neural computations in the primate brain. However, each DNN instance, just like each individual brain, has a unique connectivity and representational profile. Here, we investigate individual differences among DNN instances that arise from varying only the random initialization of the network weights. Using representational similarity analysis, we demonstrate that this minimal change in initial conditions prior to training leads to substantial differences in intermediate and higher-level network representations, even though the trained instances achieve indistinguishable network-level classification performance. We trace these effects to an underconstrained alignment of category exemplars, rather than to a misalignment of category centroids. Furthermore, while network regularization can increase the consistency of learned representations, considerable differences remain. These results suggest that computational neuroscientists working with DNNs should base their inferences on multiple network instances rather than on single off-the-shelf networks.

Fig 1 | Characterizing network internal representations via representational similarity analysis and representational consistency. (A) Our comparisons of network internal representations were based on their multivariate activation patterns, extracted from each layer of each network instance as it responded to each of 1000 test images. (B) These high-dimensional activation vectors were then used to perform a representational similarity analysis (RSA). The fundamental building blocks of RSA are representational dissimilarity matrices (RDMs), which store all pairwise distances between the network's responses to the set of test stimuli. Each test image elicits a multivariate population response in each of the network's layers, corresponding to a point in that layer's high-dimensional activation space. The geometry of these points, captured in the RDM, provides insight into the nature of the representation, as it indicates which stimuli are grouped together and which are separated. (C) To compare pairs of network instances, we compute their representational consistency, defined as the shared variance between network RDMs.
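To make the RDM construction in panel B concrete, the following is a minimal sketch of how a single layer's RDM might be computed, assuming NumPy arrays and correlation distance (1 - Pearson r) as the dissimilarity measure; the function name `compute_rdm` and the choice of distance are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def compute_rdm(activations):
    """Representational dissimilarity matrix (RDM) for one layer.

    activations: array of shape (n_stimuli, n_units), one row per
    test image's multivariate activation pattern in this layer.
    Returns an (n_stimuli, n_stimuli) symmetric matrix of pairwise
    correlation distances (1 - Pearson r), with zeros on the diagonal.
    """
    # np.corrcoef treats each row as a variable, so with one row per
    # stimulus this yields all pairwise correlations between patterns.
    return 1.0 - np.corrcoef(activations)

# Example: 1000 test images, a hypothetical layer with 4096 units.
layer_activations = np.random.randn(1000, 4096)
rdm = compute_rdm(layer_activations)  # shape (1000, 1000)
```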
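Representational consistency (panel C) can then be sketched as the shared variance between two such RDMs. The sketch below assumes shared variance is the squared correlation between the vectorized off-diagonal entries; because RDMs are symmetric with zero diagonals, only the lower triangle carries information. The specific correlation measure (Pearson here) is an assumption of this sketch, not necessarily the paper's exact choice.

```python
import numpy as np

def representational_consistency(rdm_a, rdm_b):
    """Shared variance between two networks' RDMs (same layer, same stimuli).

    Only the off-diagonal lower-triangular entries are compared: the
    diagonal is zero by construction and the matrices are symmetric,
    so the remaining entries are redundant.
    """
    rows, cols = np.tril_indices_from(rdm_a, k=-1)
    r = np.corrcoef(rdm_a[rows, cols], rdm_b[rows, cols])[0, 1]
    return r ** 2  # squared correlation = proportion of shared variance

# Example usage: compare two network instances' RDMs for the same layer.
# consistency = representational_consistency(rdm_instance1, rdm_instance2)
```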