“…In many works [5,7,10], α(w) is exponentially smoothed by a coefficient lower than 1, which is optimized on held-out data. In our experiments, however, we chose to use (10) as is, since this paper does not seek to perfectly tune an LM adaptation but rather aims to better understand the mechanisms that are useful for topic adaptation.…”
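
To make the smoothing mentioned in this passage concrete, here is a minimal sketch. It assumes the form commonly used in the LM adaptation literature, in which each scaling factor α(w) is raised to a power β with 0 < β ≤ 1; the excerpt does not give the exact form of (10), so this is an illustration rather than the authors' implementation. The function name `smooth_scaling`, the symbol `beta`, and the example values are hypothetical. Setting β = 1 leaves α(w) unchanged, which corresponds to using the scaling factors as they are.

```python
def smooth_scaling(alpha, beta):
    """Raise every scaling factor alpha(w) to the power beta.

    Assumed smoothing form: alpha(w) ** beta with 0 < beta <= 1.
    A beta below 1 pulls each factor toward 1, damping the adaptation;
    beta == 1 returns the factors unchanged.
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    return {word: factor ** beta for word, factor in alpha.items()}


# Hypothetical scaling factors, for illustration only (not from the paper).
alpha = {"election": 4.0, "the": 1.0, "goal": 0.25}

smoothed = smooth_scaling(alpha, beta=0.5)   # damped factors
unsmoothed = smooth_scaling(alpha, beta=1.0) # factors used as they are
```

In the works cited, β would be selected on held-out data, for instance by keeping the value that gives the lowest perplexity; that search is left out of this sketch.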