2017
DOI: 10.1016/j.jpdc.2016.06.004

Distributed stream clustering using micro-clusters on Apache Storm

Cited by 28 publications (12 citation statements)
References 26 publications
“…In an IoT environment, data gathering and real-time data analysis are two prime concerns because of several data outsourcing (sensor) devices, which send small data (e.g., GPS coordinates) vs large data (e.g., surveillance videos) possibly at a very high speed. However, the current stream processing systems are not able to handle such a high-velocity data [158] and require explicit ingestion corresponding to an underlying system [159]. Hence, the existing systems in a geo-distributed IoT system cannot support multiple platforms and underlying databases.…”
Section: Concluding Remarks and Open Issues (mentioning)
confidence: 99%
“…If scaling is linear, a smart city could start from a three-node cluster and scale when needed to thousands of nodes and get a proportional processing boost. To choose from the plethora of solutions which are potentially useful in a smart city environment and propose the architecture, we used the datasets described in Section 4 and the criteria described in Section 5 to evaluate: Two bulk data loading solutions: Apache Sqoop [50] vs. Oracle Loader for Hadoop [51]; Two streaming solutions: Spark Streaming [52] vs. Apache Storm [53]; Two NoSQL databases relevant for a smart city architecture: HBase [54] vs. Cassandra [55]; Two NoSQL databases using two SQL query engines: Apache Phoenix [56] vs. Presto [57]; Three Hive [58] execution engines: MapReduce vs. Tez vs. Spark [59].…”
Section: System Architecture and Components (mentioning)
confidence: 99%
“…When real-time processing with latencies in milliseconds is required, Apache Storm [53] or Spark Streaming can be used. These can be useful for processing data coming from sensors, and integrate well with a distributed message system such as Apache Kafka, that can work with hundreds of megabytes per second, from multiple clients.…”
Section: System Architecture and Components (mentioning)
confidence: 99%
“…As this type of data is so big and various, it needs to be processed extremely fast and efficiently to allow final users to take advantage of it in real time. This leads us to the conclusion that traditional data processing methods which are applied to structured data will not fit unstructured spatial big data. In the subsequent section, a solution, also known as big data architecture, that was previously developed by Amini et al. (2017) based on Apache Kafka for handling spatial big data in real time is presented.…”
Section: Spatial Big Data As the Future Of Road Transport (mentioning)
confidence: 99%