Automated location invariant animal detection in camera trap images using publicly available data sources

Shepley, Andrew; Falzon, Greg; Meek, Paul D.; Kwan, Paul

doi:10.1002/ece3.7344

Cited by 24 publications

(17 citation statements)

References 44 publications


“…Another approach to enhance generalisation is to infuse camera trap data with imagery from other sources, such as Flickr and iNaturalist. Shepley et al [20] details this work, showing that infusing camera trap data with up to 15% of imagery from other sources shows mAP performance increases of 3.66% to 18.20%, although improvements plateaued or decreased when infusion proportions increased further.…”

Section: B Object Detection and Image Classification (mentioning)

confidence: 75%
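The infusion idea in the statement above can be sketched in a few lines. This is only an illustration, not the procedure used by Shepley et al: the function name `infuse`, the file names, and the reading of "15%" as a share of the final training set are all assumptions made here.

```python
import random


def infuse(camera_trap, external, proportion, seed=0):
    """Mix external images into a camera trap training set.

    `proportion` is interpreted (an assumption) as the share of the
    final mixed set that comes from external sources.
    """
    if not 0 <= proportion < 1:
        raise ValueError("proportion must be in [0, 1)")
    # Solve n_ext / (n_cam + n_ext) = proportion for n_ext.
    n_ext = round(proportion * len(camera_trap) / (1 - proportion))
    if n_ext > len(external):
        raise ValueError("not enough external images for this proportion")
    rng = random.Random(seed)
    mixed = list(camera_trap) + rng.sample(list(external), n_ext)
    rng.shuffle(mixed)
    return mixed


# Hypothetical file names, for illustration only.
camera = [f"ct_{i}.jpg" for i in range(850)]
flickr = [f"ext_{i}.jpg" for i in range(500)]
mixed = infuse(camera, flickr, 0.15)
n_external = sum(name.startswith("ext_") for name in mixed)
```

With 850 camera trap images and a 15% target, the sketch draws 150 external images, giving a mixed set of 1000.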

“…In a bid to improve model performance (and indirectly overcome the generalisation problem), research progressed to training object detection models to first localise an animal within an image, then classify said animal [2], [16], [20]. This approach appeared to improve performance over whole image classifiers, with Tabak et al [2] showing top-1 classification accuracies (the predicted label with the highest confidence being correct) improving from 79.18% to 91.86% for full image and object detection bounding box cropped images respectively; testing was conducted on the same dataset.…”

Section: B Object Detection and Image Classification (mentioning)

confidence: 99%
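Top-1 accuracy, as defined in the statement above, counts a prediction as correct when the label with the highest confidence matches the true label. A minimal sketch follows; the function name `top1_accuracy`, the class names, and the scores are hypothetical and are not data from Tabak et al.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-confidence label is the true label.

    `scores` is a list of {label: confidence} dicts, one per image.
    """
    correct = 0
    for class_scores, true_label in zip(scores, labels):
        # The top-1 prediction is the label with the largest confidence.
        predicted = max(class_scores, key=class_scores.get)
        correct += predicted == true_label
    return correct / len(labels)


# Hypothetical confidences for three images.
scores = [
    {"deer": 0.7, "fox": 0.3},
    {"deer": 0.4, "fox": 0.6},
    {"deer": 0.9, "fox": 0.1},
]
labels = ["deer", "fox", "fox"]
accuracy = top1_accuracy(scores, labels)  # two of three correct
```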

“…An approach suggested by Shepley et al [20] for additional research, as well as provided as an option in the 2021 iWildcam competition, was the use of image segmentation to overcome issues around background imagery causing generalisation issues. Image segmentation is the process in which a statistical or deep learning model attempts to separate an image in the foreground from background imagery.…”

Section: Image Segmentation (mentioning)

confidence: 99%


Application of deep learning to camera trap data for ecologists in planning / engineering -- Can captivity imagery train a model which generalises to the wild?

Curry, Trotter, McGough 2021

Preprint


Understanding the abundance of a species is the first step towards understanding both its long-term sustainability and the impact that we may be having upon it. Ecologists use camera traps to remotely survey for the presence of specific animal species. Previous studies have shown that deep learning models can be trained to automatically detect and classify animals within camera trap imagery with high levels of confidence. However, the ability to train these models is reliant upon having enough high-quality training data. What happens when the animal is rare or the data sets are non-existent? This research proposes an approach of using images of rare animals in captivity (we focus on the Scottish wildcat) to generate the training dataset. We explore the challenges associated with generalising a model trained on captivity data when applied to data collected in the wild. The research is contextualised by the needs of ecologists in planning / engineering. Following precedents from other research, this project establishes an ensemble system of object detection, image segmentation and image classification models which are then tested using different image manipulation and class structuring techniques to encourage model generalisation. The research concludes, in the context of Scottish wildcat, that models trained on captivity imagery cannot be generalised to wild camera trap imagery using existing techniques. However, final model performances based on a two-class model (Wildcat vs Not Wildcat) achieved an overall accuracy score of 81.6% and Wildcat accuracy score of 54.8% on a test set in which only 1% of images contained a wildcat. This suggests using captivity images is feasible with further research. This is the first research which attempts to generate a training set based on captivity data and the first to explore the development of such models in the context of ecologists in planning / engineering.
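The abstract reports both an overall accuracy and a Wildcat accuracy because, on a test set where only 1% of images contain a wildcat, overall accuracy alone can mislead. The sketch below illustrates this with a hypothetical 1000-image test set and a trivial classifier that never predicts the rare class; the counts and the function name `accuracies` are assumptions, not figures from the paper.

```python
def accuracies(y_true, y_pred, positive="wildcat"):
    """Return (overall accuracy, accuracy on the positive class only)."""
    pairs = list(zip(y_true, y_pred))
    overall = sum(t == p for t, p in pairs) / len(pairs)
    positives = [(t, p) for t, p in pairs if t == positive]
    if not positives:
        raise ValueError("no positive samples in y_true")
    positive_acc = sum(t == p for t, p in positives) / len(positives)
    return overall, positive_acc


# Hypothetical test set: 1% wildcat, 99% other.
y_true = ["wildcat"] * 10 + ["other"] * 990
# A trivial classifier that always answers "other".
y_pred = ["other"] * 1000
overall, wildcat_acc = accuracies(y_true, y_pred)
```

Here the trivial classifier scores 99% overall while finding no wildcats at all, which is why the per-class figure is the more informative one for a rare species.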



“…Gomez et al [24] used ResNet101 to achieve a binary (birds vs. no birds) accuracy of 97.5% and multi-class (bird species) accuracy of 90.23%. Others, like Beery et al [25] and Shepley et al [26], used object detection techniques to eliminate non-animal images before classification. An object detection approach proposed by Wei et al [27] outperformed MLWIC [16].…”

Section: Animal Image Classification Using Convolutional Neural Network (mentioning)

confidence: 99%

An IoT System Using Deep Learning to Classify Camera Trap Images on the Edge

Zualkernan, Dhou, Judas et al. 2022

Computers


Camera traps deployed in remote locations provide an effective method for ecologists to monitor and study wildlife in a non-invasive way. However, current camera traps suffer from two problems. First, the images are manually classified and counted, which is expensive. Second, due to manual coding, the results are often stale by the time they get to the ecologists. Using the Internet of Things (IoT) combined with deep learning represents a good solution for both these problems, as the images can be classified automatically, and the results immediately made available to ecologists. This paper proposes an IoT architecture that uses deep learning on edge devices to convey animal classification results to a mobile app using the LoRaWAN low-power, wide-area network. The primary goal of the proposed approach is to reduce the cost of the wildlife monitoring process for ecologists, and to provide real-time animal sightings data from the camera traps in the field. Camera trap image data consisting of 66,400 images were used to train the InceptionV3, MobileNetV2, ResNet18, EfficientNetB1, DenseNet121, and Xception neural network models. While performance of the trained models was statistically different (Kruskal–Wallis: Accuracy H(5) = 22.34, p < 0.05; F1-score H(5) = 13.82, p = 0.0168), there was only a 3% difference in the F1-score between the worst (MobileNet V2) and the best model (Xception). Moreover, the models made similar errors (Adjusted Rand Index (ARI) > 0.88 and Adjusted Mutual Information (AMU) > 0.82). Subsequently, the best model, Xception (Accuracy = 96.1%; F1-score = 0.87; F1-Score = 0.97 with oversampling), was optimized and deployed on the Raspberry Pi, Google Coral, and Nvidia Jetson edge devices using both TensorFlow Lite and TensorRT frameworks. Optimizing the models to run on edge devices reduced the average macro F1-Score to 0.7, and adversely affected the minority classes, reducing their F1-score to as low as 0.18. Upon stress testing, by processing 1000 images consecutively, Jetson Nano, running a TensorRT model, outperformed others with a latency of 0.276 s/image (s.d. = 0.002) while consuming an average current of 1665.21 mA. Raspberry Pi consumed the least average current (838.99 mA) with a ten times worse latency of 2.83 s/image (s.d. = 0.036). Nano was the only reasonable option as an edge device because it could capture most animals whose maximum speeds were below 80 km/h, including goats, lions, ostriches, etc. While the proposed architecture is viable, unbalanced data remain a challenge and the results can potentially be improved by using object detection to reduce imbalances and by exploring semi-supervised learning.


“…In addition, with the wide application of camera trap surveys, the size of datasets increases rapidly, and the data preprocessing obstacle brought by images with no wildlife in them becomes more and more prominent [19, 20]. Cost-effective technologies are urgently needed to aid in ecological monitoring [21, 22].…”

Section: Introduction (mentioning)

confidence: 99%

Animal Detection and Classification from Camera Trap Images Using Different Mainstream Object Detection Architectures

Tan, Chao, Cheng et al. 2022

Animals


Camera traps are widely used in wildlife surveys and biodiversity monitoring. Depending on their triggering mechanism, a large number of images or videos are sometimes accumulated. Some literature has proposed the application of deep learning techniques to automatically identify wildlife in camera trap imagery, which can significantly reduce manual work and speed up analysis processes. However, there are few studies validating and comparing the applicability of different models for object detection in real field monitoring scenarios. In this study, we first constructed a wildlife image dataset of the Northeast Tiger and Leopard National Park (NTLNP dataset). Furthermore, we evaluated the recognition performance of three currently mainstream object detection architectures and compared the performance of training models on day and night data separately versus together. In this experiment, we selected YOLOv5 series models (anchor-based one-stage), Cascade R-CNN under feature extractor HRNet32 (anchor-based two-stage), and FCOS under feature extractors ResNet50 and ResNet101 (anchor-free one-stage). The experimental results showed that the performance of the object detection models under day-night joint training is satisfactory. Specifically, the average result of our models was 0.98 mAP (mean average precision) in the animal image detection and 88% accuracy in the animal video classification. One-stage YOLOv5m achieved the best recognition accuracy. With the help of AI technology, ecologists can potentially extract information from masses of imagery quickly and efficiently, saving much time.



Abstract: This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
