Reliable identification of plant species from seeds is intrinsically difficult, both because seeds offer few distinguishing features and because it requires specialized expertise that is becoming increasingly rare as the number of field plant taxonomists diminishes (Bacher 2012, Haas and Häuser 2005). Nevertheless, seed identification is relevant in several scientific domains, such as plant community ecology, archaeology, and paleoclimatology. In addition, economic activities such as agriculture require seed identification to assess the weed species contained in "soil seed banks" (Colbach 2014), enabling targeted treatments before the weeds become a problem.
In this work, we explore and evaluate several approaches, using training image sets with differing requirements and assessing their performance on test datasets from different sources.
The core training dataset is provided by the Anthos project (Castroviejo et al. 2017) as a subset of its image collection. It consists of nearly 1,000 images of seeds identified by experts.
As the identification algorithm, we use state-of-the-art convolutional neural networks for image classification (He et al. 2016). The framework is written entirely in Python and uses the TensorFlow (Abadi et al. 2016) library for deep learning.
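As an illustration of assessing performance on test datasets from different sources, the following minimal sketch computes top-1 accuracy separately for each test source. It is not the authors' evaluation code: the function name `accuracy_by_source`, the source names, and the example predictions are all hypothetical, and the classifier itself is assumed to have already produced the predicted labels.

```python
from collections import defaultdict


def accuracy_by_source(records):
    """Return top-1 accuracy per test source.

    `records` is an iterable of (source, true_label, predicted_label)
    tuples. This layout is an assumption made for the sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for source, true_label, predicted_label in records:
        total[source] += 1
        if true_label == predicted_label:
            correct[source] += 1
    # Divide the number of correct predictions by the number of images per source.
    return {source: correct[source] / total[source] for source in total}


# Made-up predictions for two hypothetical test sources.
records = [
    ("anthos", "Papaver rhoeas", "Papaver rhoeas"),
    ("anthos", "Avena fatua", "Avena fatua"),
    ("field_photos", "Papaver rhoeas", "Avena fatua"),
    ("field_photos", "Avena fatua", "Avena fatua"),
]
scores = accuracy_by_source(records)
```

Reporting accuracy per source rather than pooled makes it visible when a model trained on one image collection degrades on images acquired under different conditions.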