“…More recently, hybrid architectures combining GNNs and transformers (Rong et al., 2020; Ying et al., 2021; Min et al., 2022) have emerged to capture the topological structure of molecular graphs. Additionally, because labels for molecules are often expensive to obtain or unreliable (Xia et al., 2021; Tan et al., 2021; Xia et al., 2022a), emerging self-supervised pre-training strategies on graph-structured data (You et al., 2020; Xia et al., 2022c; Yue et al., 2022; Liu et al., 2023) hold promise for molecular graph data (Hu et al., 2020; Xia et al., 2023a; Gao et al., 2022), mirroring the overwhelming success of pre-trained language models in the natural language processing community (Devlin et al., 2019; Zheng et al., 2022).…”