The coronavirus disease (COVID-19), caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organization (WHO) in March 2020. Currently, no vaccines or treatments have been approved following clinical trials. Social distancing measures, including travel bans, school closures, and quarantines applied to countries or regions, are being used to limit the spread of the disease and the demand on healthcare infrastructure. The seclusion of groups and individuals has led to limited access to accurate information. To keep the public informed, especially in South Africa, the minister of health makes daily announcements. These announcements describe the confirmed COVID-19 cases and include the age, gender, and travel history of people who have tested positive for the disease. Additionally, the South African National Institute for Communicable Diseases updates a daily infographic summarising the number of tests performed, confirmed cases, mortality rate, and the regions affected. However, the age of each patient and other nuanced data regarding transmission are shared only in the daily announcements and not on the updated infographic. To disseminate this information, the Data Science for Social Impact research group at the University of Pretoria, South Africa, has worked on curating publicly available data and applying it in a computer-readable form so that information can be shared with the public using both a data repository and a dashboard. Through collaborative practices, a variety of challenges related to publicly available data in South Africa came to the fore. These include shortcomings in accessibility, integrity, and data management practices between governmental departments and the South African public. In this paper, solutions to these problems are shared, using a publicly available data repository and dashboard as a case study.
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a complex problem going beyond data availability and reflects systemic problems in society. As MT researchers cannot solve the problem of low-resourcedness alone, we propose participatory research as a means to involve all necessary agents required in the MT development process. We demonstrate the feasibility and scalability of participatory research with a case study on MT for African languages. Its implementation leads to a collection of novel translation datasets, MT benchmarks for over 30 languages, with human evaluations for a third of them, and enables participants without formal training to make a unique scientific contribution. Benchmarks, models, data, code, and evaluation results are released at https://github.com/masakhane-io/masakhane-mt.