“…Reasoning about spatial references has been explored in various contexts such as instruction following for 2D and 3D navigation (MacMahon et al., 2006; Vogel and Jurafsky, 2010; Chen and Mooney, 2011; Artzi and Zettlemoyer, 2013; Kim and Mooney, 2013; Andreas and Klein, 2015; Fried et al., 2018; Liu et al., 2019; Jain et al., 2019; Gaddy and Klein, 2019; Hristov et al., 2019; Chen et al., 2019) and situated dialog for robotic manipulation (Skubic et al., 2002; Kruijff et al., 2007; Kelleher and Costello, 2009; Landsiedel et al., 2017). Most of these approaches utilize supervised data, either in the form of policy demonstrations or target geometric representations.…”