“…Our text corpora originate from the following five sources: 1) WikiText-2 (Merity et al, 2017a), a dataset of formally written Wikipedia articles (we only use the first 10% of WikiText-2 which we found to be sufficient to capture formally written text), 2) Stanford Sentiment Treebank (Socher et al, 2013), a collection of 10000 polarized written movie reviews, 3) Reddit data collected from discussion forums related to politics, electronics, and relationships, 4) MELD (Poria et al, 2019), a large-scale multimodal multi-party emotional dialog dataset collected from the TV-series Friends, and 5) POM (Park et al, 2014), a dataset of spoken review videos collected across 1,000 individuals spanning multiple topics. These datasets have been the subject of recent research in language understanding (Merity et al, 2017b;Liu et al, 2019; and multimodal human language (Liang et al, 2018(Liang et al, , 2019. Table 2 summarizes these datasets.…”