Lennart Justen scite author profile

Identification of public submitted tick images: A neural network approach

Ticks and tick-borne diseases represent a growing public health threat in North America and Europe. The number of ticks, their geographical distribution, and the incidence of tick-borne diseases, like Lyme disease, are all on the rise. Accurate, real-time tick-image identification through a smartphone app or similar platform could help mitigate this threat by informing users of the risks associated with encountered ticks and by providing researchers and public health agencies with additional data on tick activity and geographic range. Here we outline the requirements for such a system, present a model that meets those requirements, and discuss remaining challenges and frontiers in automated tick identification. We compiled a user-generated dataset of more than 12,000 images of the three most common tick species found on humans in the U.S.: Amblyomma americanum, Dermacentor variabilis, and Ixodes scapularis. We used image augmentation to further increase the size of our dataset to more than 90,000 images. Here we report the development and validation of a convolutional neural network, which we call “TickIDNet,” that achieves 87.8% identification accuracy across all three species, outperforming identifications made by members of the general public or healthcare professionals. However, the model fails to match the performance of experts with formal entomological training. We find that image quality, particularly the size of the tick in the image (measured in pixels), plays a significant role in the network’s ability to correctly identify an image: images where the tick is small are less likely to be correctly identified because of the small object detection problem in deep learning. TickIDNet’s performance can be increased by using confidence thresholds to introduce an “unsure” class and by building image submission pipelines that encourage better quality photos. Our findings suggest that deep learning represents a promising frontier for tick identification that should be further explored and deployed as part of the toolkit for addressing the public health consequences of tick-borne diseases.
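The confidence-threshold idea mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration and not TickIDNet's actual code: the function name `classify_with_unsure`, the 0.8 threshold, and the ordering of the classes are all assumptions made for the example.

```python
# Hypothetical sketch: the class order and the default threshold are
# assumptions for illustration, not values taken from the paper.
SPECIES = [
    "Amblyomma americanum",
    "Dermacentor variabilis",
    "Ixodes scapularis",
]


def classify_with_unsure(probabilities, threshold=0.8):
    """Return the most probable species, or "unsure" when the top
    probability falls below the confidence threshold."""
    if len(probabilities) != len(SPECIES):
        raise ValueError("expected one probability per species")
    # Index of the class with the highest predicted probability.
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best_index] < threshold:
        return "unsure"
    return SPECIES[best_index]


if __name__ == "__main__":
    # A confident prediction is passed through unchanged.
    print(classify_with_unsure([0.05, 0.03, 0.92]))
    # A low-confidence prediction is routed to the "unsure" class.
    print(classify_with_unsure([0.40, 0.35, 0.25]))
```

Raising the threshold trades coverage for accuracy: more images are labeled "unsure", but the identifications that remain are ones the model is more confident about.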

No Time like the Present: Effects of Language Change on Automated Comment Moderation

Justen, Müller, Niemann et al. 2022

Identification of public submitted tick images: a neural network approach

Justen, Carlsmith, Paskewitz et al. 2021

Preprint

Ticks and tick-borne diseases represent a growing public health threat in North America and Europe. The number of ticks, their geographical distribution, and the incidence of tick-borne diseases, like Lyme disease, are all on the rise. Accurate, real-time tick-image identification through a smartphone app or similar platform could help mitigate this threat by informing users of the risks associated with encountered ticks and by providing researchers and public health agencies with better data on tick activity and geographic range. We report the development and validation of a convolutional neural network, a type of deep learning algorithm, trained on a dataset of more than 12,000 user-generated tick images. The model, which we call 'TickIDNet', is trained to identify the three most common tick species found on humans in the U.S.: Amblyomma americanum, Dermacentor variabilis, and Ixodes scapularis. At baseline, TickIDNet achieves 87.8% identification accuracy across all three species, outperforming identifications made by members of the general public or healthcare professionals. However, the model fails to match the performance of experts with formal entomological training. We find that image quality, particularly the size of the tick in the image (measured in pixels), plays a significant role in the network's ability to correctly identify an image: images where the tick is small are less likely to be correctly identified because of the small object detection problem in deep learning. TickIDNet's performance can be increased by using confidence thresholds to introduce an 'unsure' class and by building image submission pipelines that encourage better quality photos. Our findings suggest that deep learning represents a promising frontier for tick identification that should be further explored and deployed as part of the toolkit for addressing the public health consequences of tick-borne diseases.

No Time Like the Present: Effects of Language Change on Automated Comment Moderation

Justen, Müller, Niemann et al. 2022

Preprint

The spread of online hate has become a major problem for newspapers that host comment sections. As a result, there is growing interest in using machine learning (ML) and natural language processing (NLP) for (semi-)automated abusive language detection to avoid manual comment moderation costs or having to shut down comment sections altogether. However, much of the past work on abusive language detection with ML uses random train-test splitting procedures that assume an unrealistically static language environment. In this paper, we show, using a new German newspaper comments dataset, that a time-stratified evaluation procedure provides a more realistic measure of a classifier's performance on future data. We also show that the performance of classifiers can degrade quickly as the training data grows more outdated and as language and news coverage evolve. Further, we demonstrate that the performance of classifiers trained on data from before the COVID-19 pandemic drops sharply when evaluated on COVID-era comments. Our findings suggest that when standard ML techniques are applied naively to abusive language detection, a classifier will fail to meet the advertised evaluation benchmarks in the real-world environment.
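The difference between a random split and a time-stratified split, as contrasted in the abstract, can be sketched as follows. This is only an illustration under stated assumptions: the toy comments, the field names (`posted`, `text`, `abusive`), the helper names, and the cutoff date are hypothetical and are not taken from the paper or its dataset.

```python
import random
from datetime import date

# Toy data: field names and values are assumptions for illustration only.
COMMENTS = [
    {"posted": date(2019, 3, 1), "text": "comment A", "abusive": False},
    {"posted": date(2019, 9, 15), "text": "comment B", "abusive": True},
    {"posted": date(2020, 1, 20), "text": "comment C", "abusive": False},
    {"posted": date(2020, 6, 5), "text": "comment D", "abusive": True},
    {"posted": date(2020, 11, 30), "text": "comment E", "abusive": False},
]


def random_split(comments, test_fraction=0.4, seed=0):
    """Shuffle and split, ignoring time: test comments may predate
    training comments, which assumes a static language environment."""
    shuffled = list(comments)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def time_stratified_split(comments, cutoff):
    """Train only on comments posted before the cutoff and test on
    comments posted on or after it, mimicking deployment on future data."""
    train = [c for c in comments if c["posted"] < cutoff]
    test = [c for c in comments if c["posted"] >= cutoff]
    return train, test


if __name__ == "__main__":
    cutoff = date(2020, 3, 1)  # illustrative cutoff, not from the paper
    train, test = time_stratified_split(COMMENTS, cutoff)
    print("train:", [c["text"] for c in train])
    print("test:", [c["text"] for c in test])
```

With the time-stratified split, every test comment is newer than every training comment, so the evaluation reflects how a classifier would be used after deployment rather than how well it interpolates within a single period.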
