The Natural History Museum in the UK (NHM) is home to more than 80 million objects spanning 4.5 billion years of history. Each of these objects holds a wealth of data, whether on specimen labels, index cards, registers or diaries. Transcribing and categorising this information can help unlock crucial research potential. To do this at scale, we turn to computer vision (CV) and machine learning (ML) techniques to automate this work.
Over a million of the museum’s specimens are ornithological, including one of the largest and most comprehensive egg collections in the world. Representing 52% of known bird species, with over 300,000 clutches (a clutch being the full group of eggs laid in a single nest) collected over the last 200 years, this is arguably the most important archive of avian environmental change data in existence (Norris et al. 2023). The eggs were historically catalogued using index cards, which record key information such as identification, collection date, locality and clutch size. A proportion of these egg cards have now been imaged, and this led to the start of this project, which focuses on a sample of 15,000 photographed egg cards (example seen in Fig. 1).
Our initial approach used Google Vision to perform Optical Character Recognition (OCR) and transcribe all text on the egg cards. By locating textboxes around key terms (e.g., “Collector”), and using CV tools, we approximated boxes around every key category. Finally, each text segment was associated with a category box, followed by minor post-processing to extract (i.e., transcribe and categorise) the data. With this approach we successfully extracted the data within the sample, at 98.6% average accuracy. Although our methods worked well for our sample, they relied on a consistent card structure.
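The association step can be sketched as a simple geometric matching: each OCR text segment is assigned to the nearest category box. The following is an illustrative sketch only, not the project's actual code; the box format, field names and nearest-centre rule are assumptions for the example.

```python
# Hypothetical sketch of the box-association step: after OCR returns text
# segments with bounding boxes, each segment is assigned to the category
# box (e.g., "Collector", "Locality") whose centre lies nearest to it.

def centre(box):
    """Centre point of a box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def assign_to_categories(segments, category_boxes):
    """Map each OCR segment to the closest category box by centre distance.

    segments: list of (text, box) tuples from OCR.
    category_boxes: dict of category name -> box.
    Returns dict of category name -> list of text strings.
    """
    result = {name: [] for name in category_boxes}
    for text, box in segments:
        cx, cy = centre(box)
        # Pick the category whose box centre is nearest to this segment.
        nearest = min(
            category_boxes,
            key=lambda name: (centre(category_boxes[name])[0] - cx) ** 2
                             + (centre(category_boxes[name])[1] - cy) ** 2,
        )
        result[nearest].append(text)
    return result

# Toy example: two category anchors and two OCR segments.
cats = {"Collector": (0, 0, 100, 20), "Locality": (0, 40, 100, 60)}
segs = [("J. Smith", (110, 2, 180, 18)), ("Kent, UK", (110, 42, 190, 58))]
print(assign_to_categories(segs, cats))
# → {'Collector': ['J. Smith'], 'Locality': ['Kent, UK']}
```

In practice the real pipeline would also need the post-processing mentioned above (merging multi-line values, normalising dates), which this sketch omits.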
To expand the project further, and to mitigate the reliance on a consistent card structure, we turned to Large Language Models (LLMs). This allowed us to explore automatic data extraction from different types of cards and labels, despite variation in card structure, and even to handle unknown categories of text. Consequently, we widened the scope of the data collected, adding ornithological specimen data (e.g., skins) as well as external datasets through collaboration with the British Trust for Ornithology, which manages the Nest Record Scheme (Crick et al. 2003), holding decades of vital information on the progress of monitored nests in the UK.
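An LLM-based extraction step of this kind can be sketched as follows. This is a minimal illustration, not the project's implementation: the prompt wording and key names are assumptions, and the model call is injected as a plain callable so any LLM client could be substituted.

```python
# Illustrative sketch of LLM-based extraction: the card's transcribed text
# is sent with a prompt asking for a JSON object of categories, which
# tolerates varied card layouts and previously unseen fields.
import json

PROMPT = (
    "Extract the data from this index card transcription as a JSON object. "
    "Use keys such as species, collector, locality, date and clutch_size "
    "where present, and add extra keys for any other labelled fields.\n\n"
    "Card text:\n{card_text}"
)

def extract_card_data(card_text, llm):
    """Ask an LLM (a callable: prompt string -> reply string) for data."""
    reply = llm(PROMPT.format(card_text=card_text))
    # The reply is expected to be a JSON object; parse errors surface here.
    return json.loads(reply)

# Usage with a stand-in model that returns a fixed reply:
fake_llm = lambda prompt: '{"collector": "J. Smith", "clutch_size": 4}'
record = extract_card_data("Collector: J. Smith  Clutch: 4", fake_llm)
print(record)
# → {'collector': 'J. Smith', 'clutch_size': 4}
```

Injecting the model as a callable keeps the extraction logic independent of any particular LLM provider, which matters when comparing models or handling very different card types.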
This index-card data-extraction project is just the beginning. As we expand our data extraction capabilities, our aim is to develop a novel pipeline that can be applied not just to avifauna-related cards, but to any structured textual data, with the potential to unlock invaluable insights.