Since the introduction of BERT (Devlin et al., 2019), the research community has witnessed remarkable progress in the field of language model pre-training on large amounts of free text. Such advancements have led to significant progress in a wide range of natural language understanding (NLU) tasks (Yang et al., 2019; Clark et al., 2020; Lan et al., 2021) and text generation tasks (Radford et al., 2019; Lewis et al., 2020; Raffel et al., 2020; Su et al., 2021a,c,d,e,f,g; Zhong et al., 2021).

Contrastive Learning. Generally, contrastive learning methods distinguish observed data points from fictitious negative samples.
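As an illustrative sketch only, and not the exact objective of any of the cited works, one widely used instantiation of this idea is an InfoNCE-style loss that pulls the representation of an observed data point toward a positive view while pushing it away from negative samples:

\[
\mathcal{L}_{\text{CL}} = -\log \frac{\exp\big(\mathrm{sim}(h_i, h_i^{+}) / \tau\big)}{\exp\big(\mathrm{sim}(h_i, h_i^{+}) / \tau\big) + \sum_{j=1}^{N} \exp\big(\mathrm{sim}(h_i, h_j^{-}) / \tau\big)},
\]

where $h_i$ denotes the representation of an observed data point, $h_i^{+}$ that of a positive view, $h_j^{-}$ those of $N$ negative samples, $\mathrm{sim}(\cdot,\cdot)$ a similarity function (e.g., cosine similarity), and $\tau$ a temperature hyperparameter; these symbols are notational assumptions introduced here for illustration.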