“…Hyperspherical learning. Beyond face recognition, learning a representation on the hypersphere has also been shown to be generally useful in a diverse set of applications, such as few-shot recognition [46], [47], [48], [49], [50], deep metric learning [26], [51], self-supervised learning [52], [53], [54], [55], generative models [56], [57], geometric learning [58], [59], [60], person re-identification [27], [61], [62], [63], speech processing [64], [65], [66], [67], and text processing [68]. It has been widely observed that constraining the embedding space to a hypersphere is beneficial to generalizability.…”