“…Recently, several studies have focused on the remarkable potential of pre-trained language models, such as BERT (Devlin et al., 2019), to capture linguistic knowledge. They have shown that pretrained representations encode various linguistic properties (Tenney et al., 2019a; Talmor et al., 2020; Goodwin et al., 2020; Wu et al., 2020; Zhou and Srikumar, 2021; Chen et al., 2021; Tenney et al., 2019b), among them syntactic properties, such as parts of speech (Liu et al., 2019a) and dependency trees (Hewitt and Manning, 2019), and semantic properties, such as word senses (Reif et al., 2019) and semantic dependencies (Wu et al., 2021).…”
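To make the cited probing setup concrete, the sketch below shows the kind of linear-probing experiment used to test whether frozen representations encode a property such as part of speech (in the spirit of Liu et al., 2019a). It assumes the HuggingFace transformers library and PyTorch; the model name, tag count, and the random placeholder labels are illustrative assumptions, not details from the cited works, which train probes on gold annotations.

```python
# Minimal linear-probing sketch (illustrative only): freeze a pretrained
# encoder, extract token representations, and fit a small linear classifier
# on top of them. Real probing studies use gold POS annotations; the random
# labels below are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()  # the encoder stays frozen; only the probe is trained

NUM_TAGS = 17  # e.g., the Universal POS tag set size (assumption)
probe = torch.nn.Linear(encoder.config.hidden_size, NUM_TAGS)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

sentences = ["The cat sat on the mat.", "Probing classifiers are simple."]

for epoch in range(3):
    for text in sentences:
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():  # no gradients flow into the encoder
            hidden = encoder(**enc).last_hidden_state.squeeze(0)
        # Placeholder tags, one per wordpiece; a real probe would align
        # gold POS tags to the tokenizer's subword segmentation.
        labels = torch.randint(0, NUM_TAGS, (hidden.size(0),))
        logits = probe(hidden)
        loss = loss_fn(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

If the probe reaches high accuracy on held-out gold labels while the encoder stays frozen, the property is taken to be linearly recoverable from the representations, which is the evidence the cited studies report for syntactic and semantic properties.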