Lizhu Zhou scite author profile

G-Tree: An Efficient and Scalable Index for Spatial Search on Road Networks

In recent decades, we have witnessed the rapidly growing popularity of location-based systems. Three types of location-based queries on road networks are widely used in these systems: the single-pair shortest path query, the k nearest neighbor (kNN) query, and the keyword-based kNN query. Inspired by R-tree, we propose a height-balanced and scalable index, namely G-tree, to efficiently support these queries. The space complexity of G-tree is O(|V| log |V|), where |V| is the number of vertices in the road network. Unlike previous works that support these queries separately, G-tree supports all these queries within one framework. The basis for this framework is an assembly-based method to calculate the shortest-path distances between two vertices. Based on the assembly-based method, efficient search algorithms to answer kNN queries and keyword-based kNN queries are developed. Experimental results show G-tree's theoretical and practical superiority over existing methods.
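To make the kNN query on a road network concrete, here is a minimal sketch of the index-free baseline: expand outward from the query vertex with Dijkstra's algorithm and stop once k objects have been settled. This is not the G-tree method or its assembly-based distance computation; the function name `knn_on_road_network`, the adjacency-list layout, and the toy network are assumptions made for illustration only.

```python
import heapq


def knn_on_road_network(graph, source, objects, k):
    """Return up to k (distance, vertex) pairs for the objects nearest to source.

    graph:   dict mapping vertex -> list of (neighbor, edge_weight) pairs
    objects: set of vertices that host a point of interest
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    result = []
    while heap and len(result) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry, a shorter path was already settled
        settled.add(u)
        if u in objects:
            result.append((d, u))  # vertices are settled in increasing distance
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return result


# Hypothetical four-vertex road network with undirected weighted edges.
road = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("a", 2.0), ("c", 1.0), ("d", 4.0)],
    "c": [("a", 5.0), ("b", 1.0), ("d", 1.0)],
    "d": [("b", 4.0), ("c", 1.0)],
}
nearest = knn_on_road_network(road, "a", {"c", "d"}, k=2)
print(nearest)
```

On a large network this expansion may visit many vertices before it reaches k objects, which is the kind of cost an index such as G-tree aims to avoid.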

Effective keyword search for valuable LCAs over XML documents

Feng, Wang, et al. 2007

159

135

In this paper, we study the problem of effective keyword search over XML documents. We begin by introducing the notion of Valuable Lowest Common Ancestor (VLCA) to accurately and effectively answer keyword queries over XML documents. We then propose the concept of Compact VLCA (CVLCA) and compute the meaningful compact connected trees rooted as CVLCAs as the answers of keyword queries. To efficiently compute CVLCAs, we devise an effective optimization strategy for speeding up the computation, and exploit the key properties of CVLCA in the design of the stack-based algorithm for answering keyword queries. We have conducted an extensive experimental study and the experimental results show that our proposed approach achieves both high efficiency and effectiveness when compared with existing proposals.
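As background for the LCA terminology, the sketch below computes plain lowest common ancestors of keyword matches, assuming each XML node carries a Dewey label (the path of child positions from the root), so that the LCA of several nodes is the longest common prefix of their labels. It then keeps only the lowest candidates. This is not the VLCA or CVLCA computation proposed in the paper, nor its stack-based algorithm; the function names and the toy labels are hypothetical.

```python
from itertools import product


def lca(*labels):
    """Longest common prefix of Dewey labels, i.e. their lowest common ancestor."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def candidate_lcas(matches):
    """matches: dict keyword -> list of Dewey labels of nodes containing it.

    Returns the LCAs of every combination that takes one node per keyword.
    """
    return {lca(*combo) for combo in product(*matches.values())}


def lowest_only(lcas):
    """Drop every candidate that has another candidate as a proper descendant."""
    return {a for a in lcas
            if not any(b != a and b[:len(a)] == a for b in lcas)}


# Hypothetical document: root (0,) with two paper elements (0, 0) and (0, 1),
# each having a title child (position 0) and an author child (position 1).
matches = {
    "xml": [(0, 0, 0), (0, 1, 0)],
    "wang": [(0, 0, 1), (0, 1, 1)],
}
all_lcas = candidate_lcas(matches)
answers = lowest_only(all_lcas)
print(sorted(all_lcas), sorted(answers))
```

The root appears among the candidates only because it joins a title from one paper with an author from another, which illustrates why unfiltered LCAs can be poor answers to a keyword query.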

On Graph-Based Name Disambiguation

Fan, Wang, et al. 2011

J. Data and Information Quality

110

129

Name ambiguity stems from the fact that many people or objects share identical names in the real world. Such name ambiguity decreases the performance of document retrieval, Web search, information integration, and may cause confusion in other applications. Due to the same name spellings and lack of information, it is a nontrivial task to distinguish them accurately. In this article, we focus on investigating the problem in digital libraries to distinguish publications written by authors with identical names. We present an effective framework named GHOST (abbreviation for GrapHical framewOrk for name diSambiguaTion), to solve the problem systematically. We devise a novel similarity metric, and utilize only one type of attribute (i.e., coauthorship) in GHOST. Given the similarity matrix, intermediate results are grouped into clusters with a recently introduced powerful clustering algorithm called Affinity Propagation. In addition, as a complementary technique, user feedback can be used to enhance the performance. We evaluated the framework on the real DBLP and PubMed datasets, and the experimental results show that GHOST can achieve both high precision and recall.

Coherent closed quasi-clique discovery from large dense graph databases

et al. 2006

Frequent coherent subgraphs can provide valuable knowledge about the underlying internal structure of a graph database. Mining frequently occurring coherent subgraphs from large dense graph databases has several applications and has recently received considerable attention in the graph mining community. In this paper, we study how to efficiently mine the complete set of coherent closed quasi-cliques from large dense graph databases, which is especially challenging because the downward-closure property no longer holds. By fully exploring some properties of quasi-cliques, we propose several novel optimization techniques that prune the unpromising and redundant sub-search spaces effectively. Meanwhile, we devise an efficient closure checking scheme to facilitate the discovery of only closed quasi-cliques. We also develop a coherent closed quasi-clique mining algorithm, Cocain. A thorough performance study shows that Cocain is very efficient and scalable for large dense graph databases.
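The remark about downward closure can be checked on a small example. The sketch below assumes a common degree-based definition: a vertex set S is a γ-quasi-clique if every vertex in S is adjacent to at least ⌈γ(|S| − 1)⌉ other vertices of S. The abstract does not spell out the definition, so treat this one, the function name `is_quasi_clique`, and the toy graph as assumptions; the code is not the Cocain algorithm.

```python
import math


def is_quasi_clique(adj, vertices, gamma):
    """Check the assumed degree-based gamma-quasi-clique condition.

    adj:      dict mapping vertex -> set of adjacent vertices
    vertices: candidate vertex set S
    gamma:    density threshold in (0, 1]
    """
    s = set(vertices)
    if not s:
        return False
    need = math.ceil(gamma * (len(s) - 1))
    # Every member must have at least `need` neighbors inside S.
    return all(len(adj[v] & s) >= need for v in s)


# Hypothetical graph: the 4-cycle a-b-c-d-a.
cycle = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
}
whole = is_quasi_clique(cycle, {"a", "b", "c", "d"}, gamma=0.5)
subset = is_quasi_clique(cycle, {"a", "c"}, gamma=0.5)
print(whole, subset)
```

With γ = 0.5 the whole cycle satisfies the condition, while its subset {a, c} does not because a and c are not adjacent. A miner therefore cannot discard a candidate's supersets just because the candidate itself fails, which is the pruning shortcut that downward closure would otherwise provide.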

Parallel community detection on large networks with propinquity dynamics

Zhang, Wang, et al. 2009
