“…Instead of obtaining in-the-wild knowledge, recent works have leveraged entity background information obtained from knowledge graphs (Cui, Seo, Tabar, Ma, Wang and Lee, 2020; Zhang, Fang, Qian and Xu, 2019; Hu, Yang, Zhang, Zhong, Tang, Shi, Duan and Zhou, 2021). In multi-modal scenarios, entity knowledge is important for bridging text-image semantics (Xue, Wang, Tian, Li, Shi and Wei, 2021; Qi, Cao and Sheng, 2021b; Qi, Cao, Li, Liu, Sheng, Mi, He, Lv, Guo and Yu, 2021a; Li, Sun, Yu, Tian, Yao and Xu, 2021). These methods can provide accurate and explainable evidence, but they suffer from issues of source credibility and scalability.…”