Hyeonbyeong Lee scite author profile

Various distributed processing schemes were studied to efficiently utilize a large scale of RDF graph in semantic web services. This paper proposes a new distributed SPARQL query processing scheme considering communication costs in Spark environments to reduce I/O costs during SPARQL query processing. We divide a SPARQL query into several subqueries using a WHERE clause to process a query of an RDF graph stored in a distributed environment. The proposed scheme reduces data communication costs by grouping the divided subqueries in related nodes through the index and processing them, and the grouped subqueries calculate the cost of all possible query execution paths to select an efficient query execution path. The efficient query execution path is selected through the algorithm considering the data parsing cost of all possible query execution paths, amount of data communication, and queue time per node. It is shown through various performance evaluations that the proposed scheme outperforms the existing schemes.

show abstract

Cost Model Based Incremental Processing in Dynamic Graphs

Bok

Cho

Lee

et al. 2022

Electronics

View full text Add to dashboard Cite

Incremental graph processing has been developed to reduce unnecessary redundant calculations in dynamic graphs. In this paper, we propose an incremental dynamic graph-processing scheme using a cost model to selectively perform incremental processing or static processing. The cost model calculates the predicted values of the detection cost and processing cost of the recalculation region based on the past processing history. If there is a benefit of the cost model, incremental query processing is performed. Otherwise, static query processing is performed because the detection cost and processing cost increase due to the graph change. The proposed incremental scheme reduces the amount of computation by processing only the changed region through incremental processing. Further, it reduces the detection and disk I/O costs of the vertex, which are calculated by reusing the subgraphs from the previous results. The processing structure of the proposed scheme stores the data read from the cache and the adjacent vertices and then performs only memory mapping when processing these graph. It is demonstrated through various performance evaluations that the proposed scheme outperforms the existing schemes.

show abstract

Path Based Subgraph Searching in Distributed Environments

Bok

Kim

Lee

et al. 2023

View full text Add to dashboard Cite

k-NN Query Optimization for High-Dimensional Index Using Machine Learning

et al. 2023

View full text Add to dashboard Cite

In this study, we propose three k-nearest neighbor (k-NN) optimization techniques for a distributed, in-memory-based, high-dimensional indexing method to speed up content-based image retrieval. The proposed techniques perform distributed, in-memory, high-dimensional indexing-based k-NN query optimization: a density-based optimization technique that performs k-NN optimization using data distribution; a cost-based optimization technique using query processing cost statistics; and a learning-based optimization technique using a deep learning model, based on query logs. The proposed techniques were implemented on Spark, which supports a master/slave model for large-scale distributed processing. We showed the superiority and validity of the proposed techniques through various performance evaluations, based on high-dimensional data.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Hyeonbyeong Lee

Design and implementation of an academic expert system through big data analysis

An Efficient Distributed SPARQL Query Processing Scheme Considering Communication Costs in Spark Environments

Cost Model Based Incremental Processing in Dynamic Graphs

Path Based Subgraph Searching in Distributed Environments

k-NN Query Optimization for High-Dimensional Index Using Machine Learning

Contact Info

Product

Resources

About