“…Potential improvements include, for example, accounting for the original multi-label nature of emotion classification, or covering more than just 20 emoji in emoji prediction. Other scenarios also remain to be addressed, such as sequence tagging (Baldwin et al., 2015; Gimpel et al., 2018), multimodality (Schifanella et al., 2016; Lu et al., 2018), and code-switching tasks (Barman et al., 2014; Vilares et al., 2016). This is similar to the evolution of GLUE (Wang et al., 2019b) into SuperGLUE (Wang et al., 2019a), with both benchmarks contributing to the development of the field in different ways.…”