2017
DOI: 10.1109/tbdata.2017.2666201
A Distributed Stream Library for Java 8

Abstract: An increasingly popular application of parallel computing is Big Data, which concerns the storage and analysis of very large datasets. Many of the prominent Big Data frameworks are written in Java or JVM-based languages. However, as a base language for Big Data systems, Java still lacks a number of important capabilities such as processing very large datasets and distributing the computation over multiple machines. The introduction of Streams in Java 8 has provided a useful programming model for data-parallel …
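For context, the Stream model the abstract refers to looks like the following minimal sketch: a data-parallel pipeline that Java 8 evaluates on local cores, and that a distributed stream library would instead evaluate across machines. The data and pipeline here are illustrative, not taken from the paper.

import java.util.Arrays;
import java.util.List;

public class StreamSketch {
    public static void main(String[] args) {
        List<String> records = Arrays.asList("3", "14", "15", "92", "65");

        // A data-parallel pipeline: parallelStream() splits the work over
        // the local fork/join pool; a distributed stream library would run
        // the same map/filter/reduce pipeline across multiple machines.
        long sum = records.parallelStream()
                          .mapToLong(Long::parseLong)
                          .filter(v -> v > 10)
                          .sum();

        System.out.println("sum = " + sum); // 14 + 15 + 92 + 65 = 186
    }
}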


Cited by 7 publications (6 citation statements) | References 36 publications
“…Big data technologies such as MapReduce help agile businesses thrive [24], so the test-bed was implemented around the MapReduce operation. Another reason for choosing MapReduce is the Java Stream API, introduced in Java 8, which simplifies the implementation of MapReduce operations [25]. This section explains the evaluation scenario along with a discussion of the experimental results.…”
Section: Experimentation and Results
confidence: 99%
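The point about the Stream API simplifying MapReduce is easiest to see with the canonical word-count example; the sketch below is illustrative and not code from the cited test-bed. The map phase (flatMap over words) and the reduce phase (groupingBy with counting) fit into a single pipeline.

import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordCount {
    public static void main(String[] args) {
        Stream<String> lines = Stream.of("to be or not to be", "be quick");

        // Map phase: split each line into words.
        // Reduce phase: group identical words and count occurrences.
        Map<String, Long> counts = lines
            .flatMap(line -> Arrays.stream(line.split("\\s+")))
            .collect(Collectors.groupingBy(Function.identity(),
                                           Collectors.counting()));

        System.out.println(counts); // e.g. {not=1, be=3, or=1, to=2, quick=1}
    }
}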
“…As the literature shows, a set of solutions is available for handling big data at each level [2, 6-8, 11-13, 46, 47], including domain-specific fourth-generation languages, in-memory databases, data compression techniques, programming models, third-party libraries, accelerator hardware, and storage devices [8, 45, 48-54]. Each time a big data problem emerges, a large-scale dedicated solution is built for it, which requires huge computing and personnel resources and substantially raises overall project expenditures. The higher layers provide more abstraction and usability at the cost of reduced overall performance [8, 55], so big data solutions are frequently provided at the higher levels, ignoring the performance opportunities at lower layers, as discussed below.…”
Section: Big Data Processing Stack
confidence: 99%
“…It is less abstracted than the application level, so design and management tasks are more difficult for end programmers. Nevertheless, progress has been significant by means of domain-specific programming models and third-party libraries [8, 45, 50-52].…”
Section: Background and Motivation
confidence: 99%
“…It is less abstracted than the application level, so design and management tasks are more difficult for end programmers. Growth has been significant by means of programming models and third-party libraries [25]-[27].…”
Section: Middleware and Management Level
confidence: 99%
“…The detection stage can automatically trigger the specialized 3Vs optimizations only if the application is big data; otherwise routine processing continues. The 3Vs optimizations can be incorporated at various levels of the big data stack, such as hardware (GPGPUs, FPGAs) [20], [21], compilers (garbage collection, parallelization, loop, type-inference, and data-layout optimizations) [22]-[24], third-party libraries (Hadoop, Spark, Flink, Storm) [25]-[27], and databases (MongoDB, VoltDB) [20]. Several benefits of automatic big data detection include optimal computational resource utilization, selection of appropriate tools, triggering of relevant code either for general-purpose processors (GPPs) or for varying accelerator architectures, and minimal overhead and user intervention.…”
Section: Introduction
confidence: 99%
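The detect-then-dispatch flow described in this statement can be sketched as a simple guard; the threshold, class, and method names below are hypothetical, chosen only to illustrate routing big data inputs to a specialized path while routine inputs take the general-purpose path.

import java.util.List;

public class BigDataDispatch {
    // Hypothetical threshold: treat inputs above 1 GiB as big data.
    private static final long BIG_DATA_BYTES = 1L << 30;

    static void process(List<String> inputPaths, long totalInputBytes) {
        if (totalInputBytes >= BIG_DATA_BYTES) {
            // Detection fired: hand off to a specialized back end,
            // e.g. a Hadoop/Spark-style library or an accelerator path.
            runOptimizedPipeline(inputPaths);
        } else {
            // Otherwise routine processing continues unchanged.
            runRoutinePipeline(inputPaths);
        }
    }

    // Placeholders standing in for the specialized and routine paths.
    static void runOptimizedPipeline(List<String> paths) { }
    static void runRoutinePipeline(List<String> paths) { }
}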