“…Many of them achieve state-of-the-art (SOTA) performance on the downstream linear classification task with the backbone network fixed (Zhang, Isola, and Efros 2016; Oord, Li, and Vinyals 2018; Bachman, Hjelm, and Buchwalter 2019). However, little attention has been paid to training small models (Howard et al. 2017; Tan and Le 2019) solely under the contrastive learning framework, because its failure in this setting has been widely observed (Koohpayegani, Tejankar, and Pirsiavash 2020; Fang et al. 2021; Xu et al. 2021; Gu, Liu, and Tian 2021). In this paper, we aim to fill this gap by training small models with contrastive learning signals alone.…”