“…For the combination, several alternatives have been proposed, such as separate input channels of a convolutional neural network (Kim, 2014; Zhang et al., 2016), concatenation followed by dimensionality reduction (Yin and Schütze, 2016), or averaging of embeddings (Coates and Bollegala, 2018), e.g., for combining embeddings from multiple languages (Lange et al., 2020b; Reid et al., 2020). More recently, auto-encoders (Bollegala and Bao, 2018; Wu et al., 2020), ensembles of sentence encoders (Poerner et al., 2020), and attention-based methods (Kiela et al., 2018; Lange et al., 2019a) have been introduced. Attention-based methods allow a dynamic (input-dependent) combination of multiple embeddings.…”
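
To make the contrast between static and dynamic combination concrete, the following is a minimal sketch, not the implementation of any of the cited works. It assumes that all embeddings of a word have already been mapped to the same dimensionality, and it uses a single hypothetical `scoring_vector` in place of the learned attention parameters; the function names are likewise illustrative.

```python
import math


def average(embeddings):
    """Static combination: element-wise mean of equal-length vectors."""
    n = len(embeddings)
    return [sum(vals) / n for vals in zip(*embeddings)]


def concatenate(embeddings):
    """Static combination: join all vectors into one longer vector."""
    return [x for emb in embeddings for x in emb]


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_combine(embeddings, scoring_vector):
    """Dynamic combination: weight each embedding by a softmax score.

    Assumption: each score is the dot product of a (hypothetical) learned
    scoring vector with the embedding, so the weights depend on the input.
    """
    scores = [sum(w * x for w, x in zip(scoring_vector, emb))
              for emb in embeddings]
    weights = softmax(scores)
    dim = len(embeddings[0])
    combined = [sum(a * emb[i] for a, emb in zip(weights, embeddings))
                for i in range(dim)]
    return combined, weights


# Two toy 2-dimensional embeddings of the same word.
word_vecs = [[1.0, 0.0], [0.0, 1.0]]
avg = average(word_vecs)
cat = concatenate(word_vecs)
combined, weights = attention_combine(word_vecs, [2.0, 0.0])
```

Averaging and concatenation treat every embedding identically for every word, whereas the attention weights here change with the input vectors, which is the sense in which the combination is dynamic. In a trained model the scoring parameters would be learned jointly with the downstream task rather than fixed by hand.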