Community detection is one of the most active fields in complex networks analysis, due to its potential value in practical applications. Many works inspired by different paradigms are devoted to the development of algorithmic solutions allowing to reveal the network structure in such cohesive subgroups. Comparative studies reported in the literature usually rely on a performance measure considering the community structure as a partition (Rand Index, Normalized Mutual information, etc.). However, this type of comparison neglects the topological properties of the communities. In this article, we present a comprehensive comparative study of a representative set of community detection methods, in which we adopt both types of evaluation. Community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure. In order to mimic realworld systems, we use artificially generated realistic networks. It turns out there is no equivalence between both approaches: a high performance does not necessarily correspond to correct topological properties, and vice-versa. They can therefore be considered as complementary, and we recommend applying both of them in order to perform a complete and accurate assessment.
Abstract. Community detection has become a very important part in complex networks analysis. Authors traditionally test their algorithms on a few real or artificial networks. Testing on real networks is necessary, but also limited: the considered real networks are usually small, the actual underlying communities are generally not defined objectively, and it is not possible to control their properties. Generating artificial networks makes it possible to overcome these limitations. Until recently though, most works used variations of the classic Erdős-Rényi random model and consequently suffered from the same flaws, generating networks not realistic enough. In this work, we use Lancichinetti et al. model, which is able to generate networks with controlled power-law degree and community distributions, to test some community detection algorithms. We analyze the properties of the generated networks and use the normalized mutual information measure to assess the quality of the results and compare the considered algorithms.
A character network is a graph extracted from a narrative, in which vertices represent characters and edges correspond to interactions between them. A number of narrative-related problems can be addressed automatically through the analysis of character networks, such as summarization, classification, or role detection. Character networks are particularly relevant when considering works of fictions (e.g. novels, plays, movies, TV series), as their exploitation allows developing information retrieval and recommendation systems. However, works of fiction possess specific properties making these tasks harder.This survey aims at presenting and organizing the scientific literature related to the extraction of character networks from works of fiction, as well as their analysis. We first describe the extraction process in a generic way, and explain how its constituting steps are implemented in practice, depending on the medium of the narrative, the goal of the network analysis, and other factors. We then review the descriptive tools used to characterize character networks, with a focus on the way they are interpreted in this context. We illustrate the relevance of character networks by also providing a review of applications derived from their analysis. Finally, we identify the limitations of the existing approaches, and the most promising perspectives.
Abstract. Community detection is a very active field in complex networks analysis, consisting in identifying groups of nodes more densely interconnected relatively to the rest of the network. The existing algorithms are usually tested and compared on real-world and artificial networks, their performance being assessed through some partition similarity measure. However, artificial networks realism can be questioned, and the appropriateness of those measures is not obvious. In this study, we take advantage of recent advances concerning the characterization of community structures to tackle these questions. We first generate networks thanks to the most realistic model available to date. Their analysis reveals they display only some of the properties observed in real-world community structures. We then apply five community detection algorithms on these networks and find out the performance assessed quantitatively does not necessarily agree with a qualitative analysis of the identified communities. It therefore seems both approaches should be applied to perform a relevant comparison of the algorithms.
Moderation of user-generated content in an online community is a challenge that has great socio-economical ramifications. However, the costs incurred by delegating this work to human agents are high. For this reason, an automatic system able to detect abuse in usergenerated content is of great interest. There are a number of ways to tackle this problem, but the most commonly seen in practice are word filtering or regular expression matching. The main limitations are their vulnerability to intentional obfuscation on the part of the users, and their context-insensitive nature. Moreover, they are language-dependent and may require appropriate corpora for training.In this paper, we propose a system for automatic abuse detection that completely disregards message content. We first extract a conversational network from raw chat logs and characterize it through topological measures. We then use these as features to train a classifier on our abuse detection task. We thoroughly assess our system on a dataset of user comments originating from a French Massively Multiplayer Online Game. We identify the most appropriate network extraction parameters and discuss the discriminative power of our features, relatively to their topological and temporal nature. Our method reaches an F -measure of 83.89 when using the full feature set, improving on existing approaches. With a selection of the most discriminative features, we dramatically cut computing time while retaining most of the performance (82.65).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.