Europe is a multilingual society, in which dozens of languages are spoken. The only option to enable and to benefit from multilingualism is through Language Technologies (LT), i.e., Natural Language Processing and Speech Technologies. We describe the European Language Grid (ELG), which is targeted to evolve into the primary platform and marketplace for LT in Europe by providing one umbrella platform for the European LT landscape, including research and industry, enabling all stakeholders to upload, share and distribute their services, products and resources. At the end of our EU project, which will establish a legal entity in 2022, the ELG will provide access to approx. 1300 services for all European languages as well as thousands of data sets.
In the language domain, as in other domains, neural explainability takes an ever more important role, with feature attribution methods at the forefront. Many such methods require considerable computational resources and expert knowledge about implementation details and parameter choices. To facilitate research, we present THERMOSTAT, which consists of a large collection of model explanations and accompanying analysis tools. THERMOSTAT allows easy access to over 200k explanations for the decisions of prominent state-of-the-art models spanning across different NLP tasks, generated with multiple explainers. The dataset took over 10k GPU hours (> one year) to compile; compute time that the community now saves. The accompanying software tools allow users to analyse explanations instance-wise but also accumulatively at the corpus level. Users can investigate and compare models, datasets and explainers without the need to orchestrate implementation details. THERMOSTAT is fully open source, democratizes explainability research in the language domain, circumvents redundant computations and increases comparability and replicability.
Amid a discussion about Green AI in which we see explainability neglected, we explore the possibility of efficiently approximating computationally expensive explainers. To this end, we propose the task of feature attribution modelling, which we address with Empirical Explainers. Empirical Explainers learn from data to predict the attribution maps of expensive explainers. We train and test Empirical Explainers in the language domain and find that they model their expensive counterparts well, at a fraction of the cost. They could thus mitigate the computational burden of neural explanations significantly in applications that tolerate an approximation error.
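The core idea can be sketched in a toy setting. This is an illustrative reduction, not the paper's actual setup: here the "expensive" explainer is occlusion attribution on a small linear model, and the Empirical Explainer is a per-feature least-squares fit that learns to predict those attribution maps directly from inputs:

```python
import random

# "Expensive" explainer: occlusion attributions for a fixed linear model.
# For f(x) = sum_i W[i] * x[i], zeroing out feature i changes the score by
# W[i] * x[i], which is exactly its occlusion attribution.
W = [0.5, -1.2, 2.0, 0.3]  # illustrative weights

def model(x):
    return sum(wi * xi for wi, xi in zip(W, x))

def expensive_explainer(x):
    # One extra forward pass per feature: the costly part in real settings.
    full = model(x)
    return [full - model([0.0 if j == i else xj for j, xj in enumerate(x)])
            for i in range(len(x))]

# Empirical Explainer: learns from (input, attribution-map) pairs to predict
# attributions in a single cheap pass, amortising the explainer's cost.
def fit_empirical(inputs, attributions):
    n = len(inputs[0])
    coeffs = []
    for i in range(n):
        num = sum(x[i] * a[i] for x, a in zip(inputs, attributions))
        den = sum(x[i] ** 2 for x in inputs)
        coeffs.append(num / den)  # closed-form per-feature least squares
    return lambda x: [c * xi for c, xi in zip(coeffs, x)]

random.seed(0)
train_x = [[random.uniform(-1, 1) for _ in W] for _ in range(200)]
train_a = [expensive_explainer(x) for x in train_x]  # computed once
empirical = fit_empirical(train_x, train_a)

test_x = [0.4, -0.7, 0.1, 0.9]
approx = empirical(test_x)            # cheap: one linear map
exact = expensive_explainer(test_x)   # expensive: n+1 forward passes
```

In this linear toy case the learned explainer recovers the expensive attributions exactly; with real neural models and explainers (e.g. Integrated Gradients), the Empirical Explainer is itself a neural network and only approximates them, which is the tolerated error the abstract refers to.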
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.