Andreas Geyer-Schulz scite author profile

This paper is on a graph clustering scheme inspired by ensemble learning. In short, the idea of ensemble learning is to learn several weak classifiers and use these weak classifiers to determine a strong classifier. In this contribution, we use the generic procedure of ensemble learning and determine several weak graph clusterings (with respect to the objective function). From the partition given by the maximal overlap of these clusterings (the cluster cores), we continue the search for a strong clustering. We demonstrate the performance of this scheme by using it to maximize the modularity of a graph clustering. We show, that the quality of the initial weak clusterings is of minor importance for the quality of the final result of the scheme if we iterate the process of restarting from maximal overlaps.

show abstract

Analyzing event stream dynamics in two-mode networks: An exploratory analysis of private communication in a question and answer community

Stadtfeld

Geyer-Schulz

2011

Social Networks

View full text Add to dashboard Cite

How Symmetric Are Real-World Graphs? A Large-Scale Study

Ball

Geyer-Schulz

2018

Symmetry

View full text Add to dashboard Cite

Abstract:The analysis of symmetry is a main principle in natural sciences, especially physics. For network sciences, for example, in social sciences, computer science and data science, only a few small-scale studies of the symmetry of complex real-world graphs exist. Graph symmetry is a topic rooted in mathematics and is not yet well-received and applied in practice. This article underlines the importance of analyzing symmetry by showing the existence of symmetry in real-world graphs. An analysis of over 1500 graph datasets from the meta-repository networkrepository.com is carried out and a normalized version of the "network redundancy" measure is presented. It quantifies graph symmetry in terms of the number of orbits of the symmetry group from zero (no symmetries) to one (completely symmetric), and improves the recognition of asymmetric graphs. Over 70% of the analyzed graphs contain symmetries (i.e., graph automorphisms), independent of size and modularity. Therefore, we conclude that real-world graphs are likely to contain symmetries. This contribution is the first larger-scale study of symmetry in graphs and it shows the necessity of handling symmetry in data analysis: The existence of symmetries in graphs is the cause of two problems in graph clustering we are aware of, namely, the existence of multiple equivalent solutions with the same value of the clustering criterion and, secondly, the inability of all standard partition-comparison measures of cluster analysis to identify automorphic partitions as equivalent.

show abstract

Cluster Cores and Modularity Maximization

Ovelgönne

Geyer-Schulz

2010

View full text Add to dashboard Cite

The modularity function is a widely used measure for the quality of a graph clustering. Finding a clustering with maximal modularity is NP-hard. Thus, only heuristic algorithms are capable of processing large datasets. Extensive literature on such heuristics has been published in the recent years. We present a fast randomized greedy algorithm which uses solely local information on gradients of the objective function. Furthermore, we present an approach which first identifies the 'cores' of clusters before calculating the final clustering. The global heuristic of identifying core groups solves problems associated with pure local approaches. With the presented algorithms we were able to calculate for many realworld datasets a clustering with a higher modularity than any algorithm before.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.