“…Note that architectures, optimizers, and pooling layers differ slightly between the implementations. For SOP and Landmarks, we use ResNet-18, GeM pooling, and SGD; [38]-R-GeM and [40] use ResNet-101, GeM pooling, and Adam; [33] uses BN-Inception, a linear projection, and RMSProp; [34] uses GoogLeNet, a linear projection, and SGD; and [47] uses ResNet-50, a combination of average and max pooling followed by a linear projection, and AdamW. For TinyImageNet, we use ResNet-32 and SGD, whereas [36] uses ResNet-32 and Adam.…”
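
For readers unfamiliar with the GeM (generalized-mean) pooling named in the excerpt, the following is a minimal sketch of the standard formulation: each channel of a feature map is reduced to the p-th root of the mean of its activations raised to the power p. The function name `gem_pool`, the default `p = 3.0`, and the clamping constant `eps` are assumptions made for illustration; the excerpt does not state which exponent is used or whether it is learned.

```python
from typing import List, Sequence


def gem_pool(
    feature_map: Sequence[Sequence[Sequence[float]]],
    p: float = 3.0,      # assumed exponent; not specified in the excerpt
    eps: float = 1e-6,   # assumed clamp to keep activations strictly positive
) -> List[float]:
    """Generalized-mean pooling over a feature map shaped [C][H][W].

    Returns one pooled value per channel:
        (mean over all positions of x ** p) ** (1 / p)
    """
    pooled = []
    for channel in feature_map:
        # Flatten the spatial grid and clamp small or negative values to eps.
        values = [max(v, eps) for row in channel for v in row]
        mean_pow = sum(v ** p for v in values) / len(values)
        pooled.append(mean_pow ** (1.0 / p))
    return pooled


# One channel with a 2x2 spatial grid.
fmap = [[[1.0, 2.0], [3.0, 4.0]]]
avg_like = gem_pool(fmap, p=1.0)    # p = 1 reduces to average pooling
gem_3 = gem_pool(fmap, p=3.0)       # lies between the average and the max
max_like = gem_pool(fmap, p=100.0)  # large p approaches max pooling
```

With p = 1 the operation is plain average pooling, and as p grows it approaches max pooling, so GeM interpolates between the two pooling choices that appear in the other configurations listed above.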