This article presents the longitudinal trilingual corpus of young learners of Italian, German and English called LEONIDE. The corpus consists of L1, L2 and L3 learner texts. L1 texts were written in two languages of schooling (i.e. Italian and German), L2 texts in two languages learned as second languages (i.e. German and Italian), and L3 texts in an additional foreign language (i.e. English). All texts were collected from a group of lower secondary school pupils from the multilingual Italian province of South Tyrol whose development in all three languages was observed over a period of three years. Each text comes with rich metadata as well as manual and automatic annotations.
With language resources being collected in many projects in learner corpus research, including small ones, and considerable amounts of time and effort spent on this activity, making these types of data available in a FAIR way, with standardized and reasoned methods, would contribute substantially to the advancement of the field. Additionally, it would answer current demands for transparency, replicability and reusability. In this article, we discuss some of the challenges of making learner corpora FAIR and report on our experiences in fulfilling this aim while creating a learner corpus infrastructure at a research institution hosting five different learner corpora.
To date, research in various educational and linguistic domains such as learner corpus research, writing research, or second language acquisition has produced a substantial amount of research data in the form of L1 and L2 learner corpora. However, the multitude of individual solutions combined with domain-inherent obstacles in data sharing has so far hampered comparability, reusability and reproducibility of data and research results. In this article, we present work on creating a digital infrastructure for L1 and L2 learner corpora and populating it with data collected in the past. We embed our infrastructure efforts in the broader field of infrastructures for scientific research, drawing on technical solutions and frameworks from research data management, among them the FAIR guiding principles for data stewardship. We share our experiences from integrating some L1 and L2 learner corpora from concluded projects into the infrastructure while trying to ensure compliance with the FAIR principles and the standards we established for reproducibility, and we discuss how far research data collected in the past can be made comparable, reusable and reproducible. Our results show that some basic needs for providing comparable and reusable data are covered by existing general infrastructure solutions and can be exploited for domain-specific infrastructures such as the one presented in this article. Other aspects need genuinely domain-driven approaches. The solutions found for the corpora in the presented infrastructure can only be a preliminary attempt, and further community involvement would be needed to provide templates and models acknowledged and promoted by the community. Furthermore, forward-looking data management would be needed from the beginning of new corpus creation projects to ensure that all requirements for FAIR data can be met.
The social media hype is omnipresent these days, encouraging even public institutions to participate. This study seeks to reveal which factors have to be kept in mind when doing social media work at universities. It is also an attempt to provide a list of recommendations and possible fields of action to ensure an efficient presence on the social web. To this end, we analyzed the present situation of university efforts and evaluated their success by measuring user engagement with respect to different aspects of social media activities (e.g. content, publishing time, frequency of activities, existence of visual elements, additional links, etc.). The study shows that it seems less important how many times a week a university publishes, or how long the text messages are in detail, but that there is a significant relationship between the content of a post, the time of its publication and the elements used. This indicates that users actively perceive and interact with social media activities that encourage contact both between the profile owner and the community and within the community itself, especially if this is done in a personal, emotional or funny way that offers people ways to identify with the institution and to connect with it through well-known habits and traditions.