“…We use a combination of 17 datasets for our largest-scale training data retrieval. The datasets include αNLI, SWAG (Zellers et al, 2018), RACE (Lai et al, 2017) (we only use the middle-school subset), CODAH, RiddleSense (Lin et al, 2021), SciTail, Com2Sense (Singh et al, 2021), AI2 Science Questions (Clark et al, 2019), WinoGrande, CommonsenseQA (Talmor et al, 2019), CommonsenseQA2.0 (Talmor et al, 2021), ASQ (Fu et al, 2019), OBQA (Mihaylov et al, 2018), PhysicalIQA (Bisk et al, 2020), SocialIQA (Sap et al, 2019b), CosmosQA (Huang et al, 2019) and HellaSwag (Zellers et al, 2019). We present details of the datasets that we use for training data retrieval in Table 6.…”