2020 IEEE Third International Conference on Data Stream Mining & Processing (DSMP)
DOI: 10.1109/dsmp47368.2020.9204304

Spark Structured Streaming: Customizing Kafka Stream Processing

Cited by 9 publications (4 citation statements)
References 8 publications
“…A. Saraswathi et al. [32] also used Kafka and Spark to predict road traffic in real time. Y. Drohobytskiy et al. [33] developed a real-time multi-party data exchange that uses Apache Spark to obtain data from Apache Kafka, process it, and store it in HDFS.…”
Section: Data Process
Citation type: mentioning (confidence: 99%)
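The flow described in [33] maps directly onto the Structured Streaming API. Below is a minimal PySpark sketch of such a Kafka-to-HDFS pipeline; the broker address, topic name, and HDFS paths are illustrative assumptions (they are not given in the excerpt), and the spark-sql-kafka connector is assumed to be on the classpath.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Subscribe to a Kafka topic as an unbounded streaming DataFrame.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed address
    .option("subscribe", "exchange-topic")             # assumed topic
    .load()
)

# Kafka delivers key/value as binary; cast the payload before processing.
events = raw.selectExpr("CAST(value AS STRING) AS value", "timestamp")

# Persist micro-batches to HDFS as Parquet files.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "hdfs:///data/exchange")
    .option("checkpointLocation", "hdfs:///checkpoints/exchange")
    .start()
)
query.awaitTermination()

The checkpointLocation option is what gives the file sink restart safety and exactly-once output; Spark refuses to start a file-sink query without it.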
“…Drohobytskiy et al. [5] demonstrate customizing Kafka stream processing using Spark Structured Streaming. They show conditional monitoring procedures for processing irregular data streams efficiently.…”
Section: Literature Survey
Citation type: mentioning (confidence: 99%)
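The excerpt does not reproduce the paper's conditional monitoring procedure in detail. One hedged reading, sketched below in PySpark, gates per-micro-batch work on whether data actually arrived, so idle stretches of an irregular stream trigger no downstream writes; the topic, paths, and filtering predicate are illustrative assumptions.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("conditional-monitoring").getOrCreate()

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "sensor-topic")
    .load()
    .selectExpr("CAST(value AS STRING) AS value")
)

def process_batch(batch_df, batch_id):
    # Conditional step: skip empty micro-batches, since an irregular
    # stream often delivers nothing for long stretches.
    if batch_df.rdd.isEmpty():
        return
    # Keep only non-empty payloads and append them to storage.
    batch_df.filter(F.length("value") > 0) \
        .write.mode("append").parquet("hdfs:///data/monitored")

(
    stream.writeStream
    .foreachBatch(process_batch)
    .option("checkpointLocation", "hdfs:///checkpoints/monitoring")
    .start()
    .awaitTermination()
)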
“…Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant, and scalable data streaming [10]. Kafka is widely used for building real-time data pipelines and streaming applications, such as log aggregation, event-driven architectures, and stream processing [22].…”
Section: Apache Kafka
Citation type: mentioning (confidence: 99%)
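For readers unfamiliar with Kafka's publish/subscribe model, a minimal producer/consumer round trip illustrates the log-aggregation use case mentioned above. The kafka-python client, broker address, and topic name are assumptions for illustration; the cited works do not prescribe a client library.

from kafka import KafkaProducer, KafkaConsumer

# Publish one log record to an assumed topic on an assumed local broker.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("app-logs", b'{"level": "INFO", "msg": "service started"}')
producer.flush()

# Read the topic back from the beginning.
consumer = KafkaConsumer(
    "app-logs",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
)
for record in consumer:
    print(record.value)
    break  # illustrative: stop after one message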
“…Additionally, the function configures the Spark object with the necessary packages and settings, such as the Kafka and PostgreSQL connectors, to allow seamless integration with these external components. It also ensures that the streaming process can be stopped gracefully when required [10]. Upon successful creation, the Spark object is returned for use in the data ingestion pipeline.…”
Section: B. Consuming Data From Kafka
Citation type: mentioning (confidence: 99%)
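A sketch of what such a session-builder function might look like in PySpark follows. The helper name, package coordinates, and configuration values are assumptions, not taken from the cited paper; spark.jars.packages is the standard Spark setting for fetching connector JARs at launch.

from pyspark.sql import SparkSession

def create_spark_session():
    # Hypothetical helper mirroring the function described above.
    return (
        SparkSession.builder
        .appName("kafka-postgres-ingestion")
        # Pull the Kafka source and the PostgreSQL JDBC driver at launch;
        # versions here are assumptions.
        .config(
            "spark.jars.packages",
            "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0,"
            "org.postgresql:postgresql:42.7.3",
        )
        # This flag applies to the legacy DStreams API; Structured
        # Streaming queries are usually stopped explicitly instead.
        .config("spark.streaming.stopGracefullyOnShutdown", "true")
        .getOrCreate()
    )

spark = create_spark_session()

In Structured Streaming the usual graceful-shutdown pattern is to call query.stop() from a shutdown hook rather than rely on a single configuration flag, which is consistent with the excerpt's note that the streaming process "can be stopped gracefully when required".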