“…Researchers have proposed various context-aware networks that use contextual information to improve DocNMT models in terms of translation quality (Jean et al., 2017; Tu et al., 2018; Kuang et al., 2018) or the handling of discourse phenomena (Bawden et al., 2018; Voita et al., 2019b,a). However, most methods indiscriminately use either all context sentences within a fixed-size window that is tuned on development sets (Wang et al., 2017; Miculicich et al., 2018; Yang et al., 2019; Xu et al., 2020), or the full context of the entire document (Maruf and Haffari, 2018; Tan et al., 2019; Kang and Zong, 2020; Zheng et al., 2020). They ignore that different source sentences have individual needs for context when translated.…”