“…Finally, these latter methods have been superseded by approaches making use of distributional similarity in the form of both static and contextualized word embeddings (Gharbieh et al., 2016; Ehren, 2017; Senaldi et al., 2019; Liu and Hwa, 2019; Hashempour and Villavicencio, 2020; Kurfalı and Östling, 2020; Fakharian, 2021; Garcia et al., 2021; Nedumpozhimana and Kelleher, 2021), while keeping the underlying assumption unchanged, that is, the vector representation of the component words should be distant from the vector representation of the context, or of the expression as a whole.…”
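
The assumption stated in this excerpt can be illustrated with a minimal sketch. It is not the method of any of the cited works: the function names (`cosine_similarity`, `mean_vector`, `idiomaticity_score`), the use of mean pooling, and the toy three-dimensional vectors are all assumptions made here for illustration. The sketch scores an expression by the cosine distance between the averaged vectors of its component words and the averaged vectors of its context; under the stated assumption, an idiomatic usage should score higher than a literal one.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def idiomaticity_score(component_vectors, context_vectors):
    """Cosine distance between the pooled component and context vectors.

    Hypothetical scoring function: mean pooling is an assumption,
    not something the excerpt specifies.
    """
    components = mean_vector(component_vectors)
    context = mean_vector(context_vectors)
    return 1.0 - cosine_similarity(components, context)


# Toy, hand-made embeddings (illustrative values only, not real data).
components = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
literal_context = [[0.9, 0.1, 0.0], [1.0, 0.1, 0.1]]
idiomatic_context = [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]

literal_score = idiomaticity_score(components, literal_context)
idiomatic_score = idiomaticity_score(components, idiomatic_context)

# Under the assumption, the idiomatic usage lies farther from its components.
print(literal_score < idiomatic_score)
```

In practice the vectors would come from a static or contextualized embedding model, and the decision threshold on the score would have to be tuned on labeled data.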