Young-Ho Jeon scite author profile

To interpret, enrich, and analyze streaming data, stream applications often access data stored in an external database. Although there have been many studies on stream processing, little attention has been paid so far to the join between streaming data and stored data. In this paper, we propose a comprehensive solution called DS-join for distributed processing of this join under the micro-batch model of recent distributed stream processing engines (SPEs), such as Spark Streaming. The micro-batch model performs stream processing as a series of very small batch jobs and is more fault-tolerant in a distributed environment than the record-at-a-time model. The DS-join reduces the number of database accesses by using micro-batching. Furthermore, the DS-join optimizes the join operation by minimizing data shuffling, managing a cache in the distributed SPE, parallelizing the join processing, and balancing the load between the SPE and the external database system. Experimental results using real and synthetic datasets show that, compared with state-of-the-art methods, the DS-join significantly improves throughput, especially for large databases.

Index terms: Micro-batch model, distributed stream processing engine, database system, distributed join processing, cache management, Spark Streaming.
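The abstract says that micro-batching reduces database accesses and that a cache is managed inside the SPE. The following is a minimal single-process sketch of that general idea, not the DS-join algorithm itself: it looks up all distinct keys of one micro-batch with a single query and keeps recently used rows in an LRU cache. The table name `stored`, its columns `id` and `info`, and the names `LRUCache` and `join_micro_batch` are assumptions made for illustration. SQLite stands in for the external database, and distribution, shuffling, and load balancing are not modeled.

```python
import sqlite3
from collections import OrderedDict


class LRUCache:
    """Small least-recently-used cache (illustrative stand-in for an SPE-side cache)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
            return self.data[key]
        return None

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry


def join_micro_batch(conn, cache, batch):
    """Inner-join one micro-batch of (key, payload) records with table `stored`.

    Returns (joined_rows, number_of_database_queries_issued).
    """
    resolved = {}
    missing = []
    for key in {k for k, _ in batch}:  # distinct keys only
        value = cache.get(key)
        if value is None:
            missing.append(key)
        else:
            resolved[key] = value

    queries = 0
    if missing:
        # One query for all cache misses of the batch, instead of one per record.
        placeholders = ",".join("?" * len(missing))
        rows = conn.execute(
            f"SELECT id, info FROM stored WHERE id IN ({placeholders})", missing
        ).fetchall()
        queries = 1
        for key, info in rows:
            resolved[key] = info
            cache.put(key, info)

    # Stream records without a matching stored row are dropped (inner join).
    joined = [(k, payload, resolved[k]) for k, payload in batch if k in resolved]
    return joined, queries
```

With a record-at-a-time lookup, a batch of n records would issue up to n queries. Here each batch issues at most one, and none when every key is already cached.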

Migration from RDBMS to Column-Oriented NoSQL: Lessons Learned and Open Problems

Kim

Jeon

et al. 2017

Techniques and guidelines for effective migration from RDBMS to NoSQL

Kim

Jeon

et al. 2018

J Supercomput

A polymeric junction membrane for solid-state reference electrodes

Distributed Join Processing Between Streaming and Stored Big Data Under the Micro-Batch Model