2016
DOI: 10.1145/2983551

Incremental, iterative data processing with timely dataflow

Abstract: We describe the timely dataflow model for distributed computation and its implementation in the Naiad system. The model supports stateful iterative and incremental computations. It enables both low-latency stream processing and high-throughput batch processing, using a new approach to coordination that combines asynchronous and fine-grained synchronous execution. We describe two of the programming frameworks built on Naiad: GraphLINQ for parallel graph processing, and differential dataflow for nested iterative…

Cited by 34 publications (19 citation statements)
References 20 publications
“…In particular, conditional computation, where parts of a neural network are active on a per-example basis, has been proposed as a way to increase model capacity without a proportional increase in computation; recent research has demonstrated architectures with over 100 billion parameters [38]. Also, continuous training and inference may rely on streaming systems, perhaps using the dynamic dataflow architectures that we adapt, as suggested by work on timely dataflow and differential dataflow [28]. Further research on abstractions and implementation techniques for conditional and streaming computation seems worthwhile.…”
Section: Discussion (mentioning)
confidence: 99%
“…Dataflow-based computational models were proposed to perform complex analytics on high-volume data sets: the timely dataflow [69,70] model targets batch processing, while its extension, differential dataflow [65], targets incremental processing.…”
Section: Tool Description (mentioning)
confidence: 99%
“…However, the notion of time during the calculation of the query allows this tools to support temporary state, in particular the iterate call in Listing 29. This temporary state allows to support also more complex cases such as computing connected components in Q2 using only built-in algorithms and data structures [70].…”
Section: Explicitness Of Incrementalization (mentioning)
confidence: 99%
“…Specifically, research in dynamic query processing has recently received a big boost with: (1) the introduction of Higher-Order IVM (HIVM) [14,19,13]; (2) the identification of lower bounds and worst-case optimal algorithms for processing updates [3,4,12]; (3) the practical formulations of worst-case-optimal IVM that implement and extend these algorithms [10,11,20]; and (4) the introduction of the notion of differential dataflow for computations that require recursive or iterative processing [17,16]. These approaches often rely on materializing a succinct representation of a query's output to maintain it more efficiently, and therefore present a fundamental breakthrough with traditional IVM techniques.…”
Section: Introduction (mentioning)
confidence: 99%