Large online networks are among the most massive and richest data sources available today. The growing demand for data to support analysis conflicts sharply with network providers' efforts to protect their digital assets and with users' increasing awareness of privacy. Restrictions on the web interfaces of online networks prevent third-party researchers from gathering sufficient data, and the global view of these networks is hidden as well. Under such circumstances, only techniques that can run with local neighborhood access, such as random walks, are practical for sampling large online networks. However, the highly clustered, community-like structure of large networks gives random walks poor conductance, leading to long and hard-to-predict mixing times before useful samples can be collected. Given the lack of techniques that incorporate the topological features of online networks, in this paper we exploit community affiliation information that may come with the metadata returned when querying objects in online networks, and we propose a sped-up version of the random walk that raises the probability of selecting inter-community edges. Assuming the community structure is well established, the community-speeded random walk is expected to have better conductance and faster convergence. Our method forces the sampler to travel rapidly among different communities, which overcomes the bottlenecks and thus yields samples of higher quality. We also consider the scenario in which community affiliation is not directly available; in that case we apply feature selection algorithms to select features that serve as community labels.
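The abstract does not give the transition rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each edge whose endpoints carry different community labels is given a constant weight `boost` while intra-community edges keep weight 1, and that the walker picks its next node in proportion to these weights. The names `transition_probs`, `community_biased_walk`, `neighbors`, `community`, and `boost`, as well as the toy graph, are hypothetical and chosen for illustration.

```python
import random

def transition_probs(neighbors, community, node, boost):
    """Next-step probabilities from `node` using local neighbor access only.

    Assumption (not from the paper): an inter-community edge has weight
    `boost`, an intra-community edge has weight 1.
    """
    weights = {
        v: (boost if community[v] != community[node] else 1.0)
        for v in neighbors[node]
    }
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}

def community_biased_walk(neighbors, community, start, steps, boost=4.0, rng=None):
    """Walk `steps` steps, favoring edges that leave the current community."""
    rng = rng or random.Random()
    path = [start]
    current = start
    for _ in range(steps):
        probs = transition_probs(neighbors, community, current, boost)
        if not probs:  # isolated node: nowhere to go
            break
        current = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
        path.append(current)
    return path

# Toy graph: two triangles (communities "A" and "B") joined by the bridge 2-3.
neighbors = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

if __name__ == "__main__":
    # With boost=4, node 2 takes the bridge with probability 4/6 instead of 1/3.
    print(transition_probs(neighbors, community, 2, boost=4.0))
    print(community_biased_walk(neighbors, community, 0, 20, rng=random.Random(0)))
```

Because the weights in this sketch are symmetric, the walk is an ordinary random walk on a weighted undirected graph, whose stationary distribution is proportional to each node's total edge weight rather than uniform. Samples drawn this way would therefore need reweighting before being used for estimation; whether and how the paper does this is not stated in the abstract.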