As recent events have demonstrated, disinformation spread through social networks can have dire political, economic and social consequences. Detecting disinformation must inevitably rely on the structure of the network, on user particularities and on event occurrence patterns. We present a graph data structure, which we denote as a meta-graph, that combines the underlying users' relational event information with semantic and topical modeling. We detail the construction of an example meta-graph using Twitter data covering the 2016 US election campaign, and then compare disinformation detection at the cascade level, using well-known graph neural network algorithms, to the same algorithms applied to the meta-graph nodes. The comparison shows a consistent 3-4% improvement in accuracy across all considered algorithms when using the meta-graph instead of basic cascade classification, and a further 1% increase when topic modeling and sentiment analysis are included. We carry out the same experiment on two other datasets, HealthRelease and HealthStory, part of the FakeHealth dataset repository, with consistent results. Finally, we discuss further advantages of our approach, such as the ability to augment the graph structure with external data sources and the ease with which multiple meta-graphs can be combined, and we compare our method to other graph-based disinformation detection frameworks.
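The abstract describes classifying meta-graph nodes with standard graph neural network algorithms but does not include code. Below is a minimal sketch of that style of node classification using a two-layer GCN in PyTorch Geometric; the graph, feature dimensions, and labels are synthetic placeholders, not the paper's meta-graph construction.

```python
# Sketch: binary node classification (disinformation vs. reliable) with a
# two-layer GCN, as one might apply to meta-graph nodes. All shapes, features,
# and labels are illustrative placeholders, not the paper's actual data.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

# Toy meta-graph: 6 nodes (e.g., cascades) with 16-dim features each
# (e.g., structural, topical, and sentiment features), plus a few edges.
x = torch.randn(6, 16)
edge_index = torch.tensor([[0, 1, 1, 2, 3, 4],
                           [1, 0, 2, 1, 4, 3]], dtype=torch.long)
y = torch.tensor([0, 0, 1, 1, 0, 1])          # 0 = reliable, 1 = disinformation
data = Data(x=x, edge_index=edge_index, y=y)

class GCN(torch.nn.Module):
    def __init__(self, in_dim, hidden, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, num_classes)

    def forward(self, data):
        h = F.relu(self.conv1(data.x, data.edge_index))
        return self.conv2(h, data.edge_index)

model = GCN(16, 32, 2)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for epoch in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(model(data), data.y)
    loss.backward()
    opt.step()
```

The same training loop applies whether the nodes are raw cascades or meta-graph nodes; only the feature construction changes.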
Domain classification services have applications in multiple areas, including cybersecurity, content blocking, and targeted advertising. Yet, these services are often a black box in terms of their methodology for classifying domains, which makes it difficult to assess their strengths, aptness for specific applications, and limitations. In this work, we perform a large-scale analysis of 13 popular domain classification services on more than 4.4M hostnames. Our study empirically explores their methodologies, scalability limitations, label constellations, and their suitability to academic research as well as other practical applications such as content filtering. We find that coverage varies enormously across providers, ranging from over 90% to below 1%. All services deviate from their documented taxonomy, hampering sound usage for research. Further, labels are highly inconsistent across providers, who show little agreement over domains, making it difficult to compare or combine these services. We also show how the dynamics of crowd-sourced efforts may be obstructed by scalability and coverage aspects as well as subjective disagreements among human labelers. Finally, through case studies, we show that most services are not fit for detecting specialized content for research or content-blocking purposes. We conclude with actionable recommendations on their usage based on our empirical insights and experience. In particular, we focus on how users should handle, both in technical solutions and in research, the significant disparities observed across services.
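Two of the abstract's headline measurements are per-provider coverage and inter-provider label agreement. A small sketch of how such figures might be computed is shown below; the table layout (one row per hostname, one column per provider, an empty string meaning "no label returned") and all values are hypothetical, not the paper's dataset.

```python
# Sketch: coverage and pairwise label agreement across classification providers.
# Column names and rows are placeholders for illustration only.
import itertools
import pandas as pd

labels = pd.DataFrame({
    "hostname":   ["a.com", "b.org", "c.net", "d.io"],
    "provider_x": ["news", "shopping", "", "news"],
    "provider_y": ["media", "shopping", "adult", ""],
    "provider_z": ["news", "", "", "news"],
})
providers = [c for c in labels.columns if c != "hostname"]

# Coverage: fraction of hostnames a provider labels at all.
for p in providers:
    coverage = (labels[p] != "").mean()
    print(f"{p}: coverage {coverage:.0%}")

# Pairwise agreement, restricted to hostnames both providers labeled.
for a, b in itertools.combinations(providers, 2):
    both = labels[(labels[a] != "") & (labels[b] != "")]
    if len(both):
        agreement = (both[a] == both[b]).mean()
        print(f"{a} vs {b}: {agreement:.0%} agreement on {len(both)} hostnames")
```

Note that exact string matching understates agreement when providers use different taxonomies (e.g., "news" vs "media"), which is part of the comparability problem the abstract highlights.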
The automatic classification of fish species appearing in images and videos from underwater cameras is a challenging task, albeit one with a large potential impact on environmental conservation, marine fauna health assessment, and fishing policy. Deep neural network models, such as convolutional neural networks, are a popular solution to image recognition problems. However, such models typically require very large datasets to train millions of model parameters. Because underwater fish image and video datasets are scarce, non-uniform, and often extremely unbalanced, deep neural networks may be inadequately trained and run a much greater risk of overfitting. In this paper, we propose small convolutional neural networks as a practical engineering solution for fish image classification. The concept of “small” refers to the number of parameters of the resulting models: smaller models are lighter to run on low-power devices and drain fewer resources per execution. This is especially relevant for fish recognition systems that run unattended on offshore platforms, often on embedded hardware, where established deep neural network models would require too many computational resources. We show that even networks with little more than 12,000 parameters provide an acceptable working degree of accuracy in the classification task (almost 42% for six fish species), even when trained on small and unbalanced datasets. If the fish images come from videos, we augment the data via a low-complexity object tracking algorithm, increasing the accuracy to almost 49% for six fish species. We tested the networks with images obtained from the deployments of an experimental system in the Mediterranean Sea, showing a good level of accuracy given the low quality of the dataset.
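The abstract quantifies "small" as little more than 12,000 parameters, but does not specify the architecture. The sketch below shows what a convolutional network of roughly that size looks like in PyTorch for six output classes; the layer configuration is illustrative, not the paper's actual model.

```python
# Sketch of a "small" CNN (roughly 11k parameters) for six fish classes.
# The architecture is an illustrative assumption, not taken from the paper.
import torch
import torch.nn as nn

class TinyFishNet(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 24, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(24, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global average pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = TinyFishNet()
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params}")              # on the order of 10^4

# Global average pooling makes the network independent of input resolution.
logits = model(torch.randn(1, 3, 64, 64))
```

At this size the model fits comfortably on embedded hardware, which is the deployment scenario the abstract targets.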
The task of visual classification, done until not long ago by specialists through direct observation, has recently benefited from advancements in the field of computer vision, specifically due to statistical optimization algorithms, such as deep neural networks. In spite of their many advantages, these algorithms require a considerable amount of training data to produce meaningful results. Another downside is that neural networks are usually computationally demanding algorithms, with millions (if not tens of millions) of parameters, which restricts their deployment on low-power embedded field equipment.In this paper, we address the classification of multiple species of pelagic fish by using small convolutional networks to process images as well as videos frames. We show that such networks, even with little more than 12,000 parameters and trained on small datasets, provide relatively high accuracy (almost 42% for six fish species) in the classification task. Moreover, if the fish images come from videos, we deploy a simple object tracking algorithm to augment the data, increasing the accuracy to almost 49% for six fish species. The small size of our convolutional networks enables their deployment on relatively limited devices.
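This abstract, like the previous one, mentions augmenting the training data with a simple object tracking algorithm when video is available. The specific tracker is not named; the sketch below illustrates the general idea with a generic OpenCV tracker, harvesting additional labeled crops from one manually annotated bounding box. The file name, initial box, and tracker choice are placeholders, and depending on the OpenCV build the tracker factory may live under cv2.legacy or require the opencv-contrib-python package.

```python
# Sketch: tracking-based data augmentation from video. One manual annotation
# in the first frame is propagated forward, and every tracked crop inherits
# that label. Tracker, video path, and box are illustrative assumptions.
import cv2

cap = cv2.VideoCapture("fish_clip.mp4")       # hypothetical input video
ok, frame = cap.read()
init_box = (120, 80, 60, 40)                  # (x, y, w, h) from a manual label

tracker = cv2.TrackerCSRT_create()            # may require opencv-contrib
tracker.init(frame, init_box)

crops = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, (x, y, w, h) = tracker.update(frame)
    if ok:
        crops.append(frame[int(y):int(y + h), int(x):int(x + w)])
cap.release()
print(f"collected {len(crops)} additional crops from one annotation")
```

Each tracked crop is a lightly perturbed view of the same fish, which is what lifts the reported accuracy from about 42% to about 49% in the abstract's setting.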