Online social networks (OSNs) have become important platforms for efficiently connecting people and promoting information dissemination, which is of great importance to our social life and society. However, due to privacy concerns, and access limitations, it is difficult to obtain the whole network of OSNs for analysis, so it is critical to have a representative subgraph. Yet due to the same reasons, we are in lack of the original network as the ground truth which poses great challenges on evaluating sampling methods on the performance on unbiasedness, let alone representativeness. Thus uniform sampling (UNI) [Gjoka et al. 2010] was proposed to obtain an unbiased nodal property distribution as of the original network to evaluate the degree of bias of other methods. Yet UNI sampling suffers from its low efficiency, and the representativeness and connectivity of the obtained subgraph, which is formed by the sampled nodes and connections between them, are rarely studied. We propose an adaptive UNI sampling (adpUNI) method to overcome previously mentioned disadvantages of UNI by dividing the userID space into several intervals, whose sampling probability adaptively changes based on its target rate. By adding its neighbors of the targeted node into the sample set (adpUNI+N), we can further improve the performance on sampling efficiency and obtain a more connective and representative subgraph. When applied to Sina Weibo and Twitter, our methods over-perform other classical methods on sampling efficiency, and always have a better performance on connectivity and representativeness than UNI sampling. And we also find that an unbiased sample doesn't guarantee a more representative subgraph.INDEX TERMS Complex networks, representative sampling, large-scale online social networks, UNI, adaptive method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.