In this data paper, we present a specimen-based occurrence dataset compiled in the framework of the Conservation of Endemic Central African Trees (ECAT) project with the aim of producing global conservation assessments for the IUCN Red List. The project targets all tree species endemic or sub-endemic to the Central African region comprising the Democratic Republic of the Congo (DR Congo), Rwanda, and Burundi. The dataset contains 6361 plant collection records with occurrences of 8910 specimens from 337 taxa belonging to 153 genera in 52 families. Many of these tree taxa have restricted geographic ranges and are only known from a small number of herbarium specimens. As assessments for such taxa can be compromised by inadequate data, we transcribed and geo-referenced specimen label information to obtain a more accurate and complete locality dataset. All specimen data were manually cleaned and verified by botanical experts, resulting in improved data quality and consistency.
Many, if not most, countries have several official or widely used languages, and most, if not all, of these countries have herbaria. Furthermore, specimens have been exchanged between herbaria from many countries, so herbaria are often polylingual collections. It is therefore useful to have label transcription systems that can attract users proficient in a wide variety of languages. Belgium is a typical polylingual country at the boundary between the Romance and Franconian languages (French, Dutch & German). Yet there are currently few non-English transcription platforms for citizen science. This is why, in Belgium, we built DoeDat, based on the DigiVol system of the Atlas of Living Australia. We will demonstrate DoeDat and its multilingual features. We will explain how we enter translations, both for the user interface and for the dynamic parts of the website. We will share our experiences of running a multilingual site and the challenges it brings. Translating and running such a website requires skilled personnel and patience. However, our experience has been positive, and the number and quality of our volunteer transcriptions have been rewarding. We look forward to the further use of DoeDat to transcribe data in many other languages. There is no longer any reason to exclude willing volunteers, whatever their language.
Herbarium specimens hold a wealth of data about plants: where they come from, where they were collected and by whom. Once digitized, these data can be searched, mapped and compared. However, the information on specimens is often handwritten, and even the best software systems cannot read it. This is where we get real value from citizen involvement: digitizing these data is only possible with the aid of human intelligence. DoeDat is a multilingual open-source platform for transcription, based upon the DigiVol program of the Australian Museum and Atlas of Living Australia. DoeDat is a product of our digitization project Digital Access to Cultural Heritage Collections (DOE!), funded by the Flemish Government. DoeDat is about creating data; fittingly, ‘Doe Dat’ means ‘do that’ in Dutch. DoeDat will help us digitize our collections and will also give the public the chance to take an active part in the process. We aim to build a community of enthusiastic online volunteers who will help us liberate botanical data from specimen labels and documents. We launched the platform on Science Day, and within two months more than one hundred volunteers had transcribed more than 4,000 specimens. Join in at www.DoeDat.be
Specimen labels are written in numerous languages, and accurate interpretation requires local knowledge of place names, vernacular names and people’s names. In many countries more than one language is in common usage. Belgium, for example, has three official languages. Crowdsourcing has helped many collections digitize their labels and has generated useful data for science. Furthermore, direct engagement of the public with a herbarium increases the collection’s visibility and potentially reinforces a sense of common ownership. For these reasons we built DoeDat, a multilingual crowdsourcing platform forked from DigiVol of the Australian Museum (Figs 1, 2). Some of the useful features we inherited from DigiVol include a georeferencing tool, configurable templates, simple project management and individual institutional branding. Running a multilingual website does increase the work needed to set up and manage projects, but we hope to gain from the broader engagement we can attract. Currently, we are focusing our work on Belgian collections where Dutch and French are the primary languages, but in the future we may add further languages when we work on our international collections. We also hope that we can eventually merge our code with that of DigiVol, so that both projects can benefit from each other's developments.