“…Mikolov et al., 2013) are known to struggle with rare words, several techniques for improving their representations have been proposed. These approaches exploit either the contexts in which rare words occur (Lazaridou et al., 2017; Herbelot and Baroni, 2017; Khodak et al., 2018; Liu et al., 2019a), their surface form (Luong et al., 2013; Bojanowski et al., 2017; Pinter et al., 2017), or both (Schick and Schütze, 2019a,b; Hautte et al., 2019). However, all of this prior work is designed for and evaluated on uncontextualized word embeddings.…”