2015
DOI: 10.14778/2824032.2824076

The Dataflow Model

Abstract: Unbounded, unordered, global-scale datasets are increasingly common in day-to-day business (e.g. Web logs, mobile usage statistics, and sensor networks). At the same time, consumers of these datasets have evolved sophisticated requirements, such as event-time ordering and windowing by features of the data themselves, in addition to an insatiable hunger for faster answers. Meanwhile, practicality dictates that one can never fully optimize along all dimensions of correctness, latency, and cost for these types of…


Cited by 402 publications (76 citation statements)
References: 9 publications

“…Following the Millwheel model [4], Flink supports record timestamps and watermarks to decide when a computation can be performed. 5 Note that these semantics are different from those for the STREAM keyword proposed in 6.5.1.…”
Section: B21 Stateful and Event-time Stream
Citation type: mentioning
confidence: 93%
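
The watermark idea mentioned in this statement can be illustrated with a minimal Python sketch. It is not Flink's or MillWheel's API; the function name `tumbling_window_sums`, the tuple-based event encoding, and the fixed-size (tumbling) windows are all assumptions made for illustration. Records carry an event timestamp, and a window's result is emitted only once a watermark indicates that no earlier timestamps are expected.

```python
from collections import defaultdict


def tumbling_window_sums(events, window_size):
    """Sum values per fixed-size event-time window, firing on watermarks.

    `events` is an iterable of hypothetical tuples:
      ("record", event_time, value)  -- a data record with its event timestamp
      ("watermark", time)            -- a claim that no record older than `time` is expected
    Yields (window_start, window_end, total) once the watermark reaches window_end.
    Late records are not handled specially: a record that arrives after its
    window has fired simply opens that window again.
    """
    open_windows = defaultdict(int)
    for item in events:
        if item[0] == "record":
            _, event_time, value = item
            # Assign the record to a window by its event time, not arrival order.
            start = (event_time // window_size) * window_size
            open_windows[start] += value
        else:
            _, watermark = item
            # Collect the windows that the watermark has passed, then fire them.
            ready = sorted(s for s in open_windows if s + window_size <= watermark)
            for start in ready:
                yield (start, start + window_size, open_windows.pop(start))


events = [
    ("record", 1, 10),
    ("record", 12, 5),
    ("record", 3, 7),   # arrives out of order, still lands in window [0, 10)
    ("watermark", 10),
    ("record", 14, 1),
    ("watermark", 20),
]
results = list(tumbling_window_sums(events, 10))
```

In this sketch the out-of-order record with timestamp 3 is still counted in the first window, because nothing is emitted until the watermark reaches the window's end.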
“…Apache Beam [15] has recently added SQL support, developed with a careful eye towards Beam's unification of bounded and unbounded data processing [5]. Beam currently implements a subset of the semantics proposed by this paper, and many of the proposed extensions have been informed by our experiences with Beam over the years.…”
Section: Contemporary Streaming Systems
Citation type: mentioning
confidence: 99%
“…The time-based model is another example of FCF windows which are often easier to understand for the final users, as they can relate the windows evolution directly to time (this is useful for business applications like in e-commerce and on-line auction systems). The use of session windows [21] emphasizes the variability of the window extents that are defined based on the frequency of input tuples. Session windows are disjoint (not-overlapped) and their extents consist in a set of consecutive input tuples whose time gap (between two consecutive tuples) does not exceed a defined inactivity gap parameter.…”
Section: A Taxonomy Of Windowing Models
Citation type: mentioning
confidence: 99%
“…Session windows are disjoint (not-overlapped) and their extents consist in a set of consecutive input tuples whose time gap (between two consecutive tuples) does not exceed a defined inactivity gap parameter. They are used when the input stream is highly irregularly distributed over the time and are useful to model the clients' behavior like in network monitoring applications [21]. Hybrid models like slide-by-tuple windows and the dual slide-by-time (i.e.…”
Section: A Taxonomy Of Windowing Models
Citation type: mentioning
confidence: 99%
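
The session-window definition quoted in the two statements above (consecutive tuples whose time gap does not exceed an inactivity gap) can be sketched in a few lines of Python. The function name `session_windows` and the choice to sort timestamps before grouping are assumptions for illustration; the cited works may treat out-of-order input differently.

```python
def session_windows(timestamps, gap):
    """Group timestamps into disjoint sessions.

    A timestamp joins the current session when its distance from the previous
    timestamp does not exceed `gap` (the inactivity gap); otherwise it starts
    a new session. Sorting first is an assumption of this sketch so that
    out-of-order input is grouped by event time.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the inactivity gap
        else:
            sessions.append([ts])     # gap exceeded: open a new session
    return sessions


sessions = session_windows([1, 2, 10, 3, 11, 30], gap=5)
```

Because window extents depend only on the spacing of the input, a bursty stream yields a few long sessions while a sparse one yields many short sessions, which is the variability the first statement describes.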