“…For each trajectory in the retrieval batch, we represent each time-step by a pair of vectors $h_{i,t}$ and $b_{i,t}$ (Figure 6 in the appendix), where $h_{i,t}$ summarizes the past of the $i$-th trajectory (i.e., time-steps $0$ through $t$), while $b_{i,t}$ summarizes its future (i.e., time-steps $t$ through the end of the trajectory). In addition, taking inspiration from (Jaderberg et al., 2016; Trinh et al., 2018; Ke et al., 2019; Devlin et al., 2018; Mazoure et al., 2020), we use auxiliary losses to improve modeling of long-term dependencies when training the parameters of our forward and backward summarizers. The goal of these losses is to force the representation $(h_{i,t}, b_{i,t})_{i,t \geq 0}$ to capture meaningful information for the unknown downstream task.…”
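
The forward/backward summarization described in the excerpt can be illustrated with a minimal sketch. The excerpt does not specify the summarizer architecture, so the code below substitutes a fixed exponential-average recurrence for the learned forward and backward summarizers; the names `step`, `summarize`, and `decay` are hypothetical and chosen only for illustration. It shows only the indexing convention: the forward summary at step t depends on steps 0 through t, and the backward summary at step t depends on steps t through the end of the trajectory.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def step(state: Vector, x: Sequence[float], decay: float = 0.5) -> Vector:
    """Toy recurrent update; an assumed stand-in for a learned summarizer cell."""
    return [decay * s + (1.0 - decay) * xi for s, xi in zip(state, x)]


def summarize(
    trajectory: Sequence[Sequence[float]], decay: float = 0.5
) -> Tuple[List[Vector], List[Vector]]:
    """Return (h, b) where h[t] summarizes steps 0..t and b[t] summarizes steps t..end."""
    if not trajectory:
        return [], []
    dim = len(trajectory[0])

    # Forward pass: scan left to right, so h[t] has seen steps 0..t.
    h: List[Vector] = []
    state = [0.0] * dim
    for x in trajectory:
        state = step(state, x, decay)
        h.append(state)

    # Backward pass: scan right to left, so b[t] has seen steps t..end.
    b: List[Vector] = []
    state = [0.0] * dim
    for x in reversed(trajectory):
        state = step(state, x, decay)
        b.append(state)
    b.reverse()  # restore time order so b[t] lines up with h[t]

    return h, b


# A one-dimensional trajectory with a single non-zero observation at step 0.
trajectory = [[1.0], [0.0], [0.0]]
h, b = summarize(trajectory)
```

In this toy run, the observation at step 0 appears in every forward summary (`h` is `[[0.5], [0.25], [0.125]]`) but only in the first backward summary (`b` is `[[0.5], [0.0], [0.0]]`), because `b[1]` and `b[2]` only look at steps from their own index onward. The auxiliary losses mentioned in the excerpt are not modeled here.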