Within the UK, livestock holdings are registered so that livestock can be traced and animal diseases controlled. These regulations are enforced irrespective of farm size; however, they tend to be better followed on traditional farms, whereas holdings new to keeping livestock are less likely to be aware of their obligations. Such smallholdings may therefore evade registration, are less likely to participate in national disease surveillance, and ultimately complicate national animal disease control. Little is known about small-scale livestock keepers, in particular those without a traditional farming background. Smallholders have been known to play a vital role in zoonotic disease outbreaks, and more action is needed to improve surveillance systems by incorporating this demographic into current intelligence. The literature indicates that parts of these communities often use social media as a means of communication and information sharing. Twitter followers of a prominent smallholder user in the UK were extracted and manually categorised as smallholder or not, based on their profile descriptions. Just under 1,000 Twitter profiles were manually coded to build a robust training dataset. Text classification algorithms were applied to this annotated data, and the resulting classifiers achieved accuracies of over 80%. The results indicate that classification can be a highly successful tool, provided that a sufficient training dataset is curated and the user profiles on social media contain enough textual information.
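To illustrate the kind of profile classification described above, the following is a minimal sketch of a multinomial naive Bayes text classifier written with the Python standard library only. The abstract does not name the algorithms used, so naive Bayes is an assumption chosen for simplicity; the function names, the label names and the four example profiles are hypothetical and are not data from the study.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a profile description and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(examples):
    """Fit a multinomial naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token counts
    label_counts = Counter()            # label -> number of profiles
    vocabulary = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_profiles = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: share of training profiles carrying this label.
        score = math.log(label_counts[label] / total_profiles)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(text):
            if token in vocabulary:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-written profiles (not data from the study).
TRAINING_PROFILES = [
    ("Smallholder keeping pigs, hens and a few sheep", "smallholder"),
    ("We raise rare breed poultry and goats on our smallholding", "smallholder"),
    ("Software engineer, coffee lover, football fan", "other"),
    ("Marketing consultant and keen cyclist based in London", "other"),
]

model = train_naive_bayes(TRAINING_PROFILES)
print(predict(model, "Family smallholding with hens and pigs"))
```

With a realistic dataset, the manually coded profiles would replace `TRAINING_PROFILES`, and accuracy would be estimated on profiles held out from training rather than on the training set itself.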
Web scraping and text mining are popular computer science methods deployed by public health researchers to augment traditional epidemiological surveillance. Within veterinary disease surveillance, however, such techniques are still in the early stages of development and have not yet been fully utilised. This study explores the utility of incorporating internet-based data to better understand smallholder farming communities in Scotland, using online text extraction and subsequent mining of the extracted data. Livestock fora were web scraped, and the resulting text was mined for common themes, words and topics. Results from bi-grams and topic modelling uncover four main topics of interest pertaining to aspects of livestock husbandry: feeding, breeding, slaughter, and disposal. These topics were found in both the poultry and pig sub-forums. Topic modelling appears to be a useful method of unsupervised classification for this form of data, as it produced clusters that relate to biosecurity and animal welfare. Internet data can be a very effective tool in aiding traditional veterinary surveillance methods, but human validation of such data is crucial. This opens avenues of research via the incorporation of other dynamic social media data, namely Twitter and Facebook/Meta, in addition to time series analysis to highlight temporal patterns.
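As a rough illustration of the bi-gram step mentioned above, the sketch below counts adjacent word pairs across a handful of forum posts using only the Python standard library. It covers bi-gram counting only, not topic modelling. The stop-word list, the function names and the example posts are all hypothetical and do not come from the study.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the study's list).
STOP_WORDS = {
    "a", "an", "and", "the", "to", "of", "in", "is",
    "for", "my", "i", "on", "with", "how", "do",
}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def count_bigrams(posts):
    """Count pairs of consecutive tokens within each post.

    Stop words are removed before pairing, so two words separated
    only by a stop word are treated as adjacent.
    """
    counts = Counter()
    for post in posts:
        tokens = tokenize(post)
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical forum posts (not data from the study).
FORUM_POSTS = [
    "Looking for advice on layers pellets for my hens",
    "Are layers pellets enough or do hens need grit",
    "Best way to handle fallen stock disposal",
    "Who collects fallen stock in the Highlands?",
]

bigram_counts = count_bigrams(FORUM_POSTS)
for bigram, count in bigram_counts.most_common(2):
    print(" ".join(bigram), count)
```

Frequent pairs such as these give a quick, human-checkable view of recurring themes before any topic model is fitted, which fits the abstract's point that human validation of the data remains necessary.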