“…Syntactic information can be used on the source side (Eriguchi, Tsuruoka, and Cho 2017), on the target side (Aharoni and Goldberg 2017), or on both (Wu, Zhang, Zhang, Yang, Li, and Zhou 2018). It can be represented as constituent trees (Eriguchi, Hashimoto, and Tsuruoka 2016), packed forests (Ma, Tamura, Utiyama, Zhao, and Sumita 2018), or graphs (Hashimoto and Tsuruoka 2017). There have also been some attempts to use syntactic information explicitly in the Transformer (Strubell, Verga, Andor, Weiss, and McCallum 2018).…”