Model soups synthesize multiple models after fine-tuning them with different hyperparameters, selecting and averaging them based on their accuracy on validation data; every model is trained on the same training and validation sets. In this study, we maximized fine-tuning accuracy while retaining the inference time and memory cost of a single model. We extended model soups by creating k subsets of training and validation data, in a manner similar to k-fold cross-validation, and training models on these subsets. First, we showed that validation accuracy correlates with test accuracy even when models are synthesized such that their training data contain the validation data. We then showed that synthesizing these k models, after first synthesizing the models within each subset of the same training and validation data, yields a single model with high test accuracy. This study provides a method for training models with both high accuracy and reliability on small datasets, such as medical images.
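As a rough illustration of the averaging step, the sketch below soups a list of fine-tuned checkpoints by uniformly averaging their weights; the `soup` helper and the two-stage usage at the bottom are our own illustration of the k-subset extension under stated assumptions, not the authors' released code (the original model-soup recipe also has a greedy variant that admits models in decreasing order of validation accuracy).

```python
import copy
import torch

def soup(state_dicts):
    """Average the parameters of several fine-tuned models ("uniform soup").

    The result is a single set of weights, so inference keeps the time and
    memory cost of one model.
    """
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        # Stack the same tensor from every checkpoint and take the mean.
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        avg[key] = stacked.mean(dim=0).to(avg[key].dtype)
    return avg

# Hypothetical two-stage use for the k-fold-style extension:
# first soup the models trained on each subset, then soup the k results.
# subset_soups = [soup(checkpoints[i]) for i in range(k)]   # within-subset soups
# final_weights = soup(subset_soups)                        # soup of soups
```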
Out-of-distribution (OOD) detection, the identification of samples not drawn from the training distribution, is essential for improving the reliability of deep learning. Recent methods based on unsupervised representation learning achieve high OOD-detection accuracy; however, their in-distribution (IN-D) classification accuracy is reduced. This is caused by the cross-entropy loss that trains the network to predict shifting transformations (such as rotation angles) for OOD detection. This loss conflicts with the consistency objective of representation learning, namely that differently augmented views of the same sample should share the same representation. To avoid this problem, we add a Jensen-Shannon divergence (JSD) consistency loss. To demonstrate its effectiveness for both OOD detection and IN-D classification, we apply it to contrasting shifted instances (CSI), a method built on recent representation learning. Our experiments demonstrate that the JSD consistency loss outperforms existing methods in both OOD detection and IN-D classification on unlabeled multi-class datasets.
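To make the consistency term concrete, here is a minimal PyTorch sketch of a JSD loss between the network's predictive distributions for two augmented views of the same batch; the function name and the way views are paired are our assumptions, not the exact formulation used with CSI.

```python
import torch
import torch.nn.functional as F

def jsd_consistency_loss(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon divergence between the predictive distributions of two
    augmented views of the same samples (a minimal sketch, not CSI's code).

    JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), where m = (p + q) / 2.
    """
    p = F.softmax(logits_a, dim=-1)
    q = F.softmax(logits_b, dim=-1)
    m = 0.5 * (p + q)
    # F.kl_div expects log-probabilities as its first argument and computes
    # KL(target || input), so kl_div(m.log(), p) is KL(p || m).
    kl_pm = F.kl_div(m.log(), p, reduction="batchmean")
    kl_qm = F.kl_div(m.log(), q, reduction="batchmean")
    return 0.5 * (kl_pm + kl_qm)
```

Because JSD is symmetric and bounded, minimizing it pulls the two views' predictions together without forcing either view toward a hard target, which is why it can coexist with the shifting-transformation cross-entropy objective.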