Multi-input multi-output structures have been developed to boost performance by learning multiple ensemble members at a small additional cost over a single network. Several attempts have been made to further develop multi-input multi-output structures; however, integrating the benefits of self-supervised learning into a multi-input multi-output structure has not yet been studied. In this work, we develop a multi-input multi-output structure designed to jointly learn the original and self-supervised tasks, thereby leveraging the benefits of self-supervised learning. Specifically, on the input side, we improve the mixing strategy and minibatch structure for a rotation-based self-supervised learning technique, and on the output side, we extend the label space of the multiple classifiers to predict both the original class and the true rotation degree. With wider networks on the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets, our method outperforms previous works even with nearly half the number of parameters; for example, on Tiny ImageNet it uses only about 45.8% of the parameters of the best-performing multi-input multi-output method, MixMo, while still achieving a 2.01% improvement.
INDEX TERMS convolutional neural networks (CNNs), deep ensemble, multi-input multi-output network
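To make the extended label-space idea concrete, the following is a minimal PyTorch sketch, not the paper's implementation: it assumes a two-member (M = 2) multi-input setup with a ResNet-18 backbone, rotates each member's input by a multiple of 90 degrees, and gives each classifier head NUM_CLASSES x NUM_ROTS outputs so it jointly predicts the original class and the rotation. The names MIMORotNet and joint_label are hypothetical.

```python
import torch
import torch.nn as nn
import torchvision.models as models

NUM_CLASSES = 10  # e.g., CIFAR-10 (assumed dataset)
NUM_ROTS = 4      # rotations: 0, 90, 180, 270 degrees
M = 2             # number of ensemble members (assumed)

class MIMORotNet(nn.Module):
    """Hypothetical sketch: M inputs are stacked at the stem, and each of the
    M classifier heads predicts over the joint (class, rotation) label space."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18()
        backbone.fc = nn.Identity()  # expose 512-d features
        # Stem accepts M stacked RGB inputs (3 * M channels).
        backbone.conv1 = nn.Conv2d(3 * M, 64, kernel_size=7, stride=2,
                                   padding=3, bias=False)
        self.backbone = backbone
        # One head per member; each head covers NUM_CLASSES * NUM_ROTS labels.
        self.heads = nn.ModuleList(
            [nn.Linear(512, NUM_CLASSES * NUM_ROTS) for _ in range(M)])

    def forward(self, xs):  # xs: list of M tensors, each (B, 3, H, W)
        feats = self.backbone(torch.cat(xs, dim=1))
        return [head(feats) for head in self.heads]

def joint_label(class_idx, rot_idx):
    """Encode (class, rotation) as one index in the extended label space."""
    return class_idx * NUM_ROTS + rot_idx

# Usage: rotate each member's input by a multiple of 90 degrees, then train
# each head with standard cross-entropy against the joint label.
xs = [torch.rot90(torch.randn(8, 3, 32, 32), k, dims=(2, 3)) for k in (1, 3)]
logits = MIMORotNet()(xs)  # list of M tensors, each (8, NUM_CLASSES * NUM_ROTS)
```

Folding the rotation target into the class label, rather than adding a separate rotation head, keeps each member a standard softmax classifier, which is one plausible reading of "extending the label space of multiple classifiers."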