2018
DOI: 10.14778/3229863.3229865

On optimizing operator fusion plans for large-scale machine learning in SystemML

Abstract: Many large-scale machine learning (ML) systems allow specifying custom ML algorithms by means of linear algebra programs, and then automatically generate efficient execution plans. In this context, optimization opportunities for fused operators, in terms of fused chains of basic operators, are ubiquitous. These opportunities include (1) fewer materialized intermediates, (2) fewer scans of input data, and (3) the exploitation of sparsity across chains of operators. Automatic operator fusion eliminates the need fo…
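To make benefits (1)-(3) concrete, here is a minimal, illustrative sketch (not SystemML's actual generated code) contrasting an unfused element-wise chain, which materializes dense intermediates, with a fused single-pass kernel that exploits the sparsity of X:

```python
import numpy as np
from scipy.sparse import random as sprand

# X is sparse; Y and Z are dense. Sizes and density are arbitrary for the demo.
X = sprand(1000, 1000, density=0.01, format="csr", random_state=0)
Y = np.random.rand(1000, 1000)
Z = np.random.rand(1000, 1000)

# Unfused: each binary operator materializes a dense 1000x1000 intermediate,
# and the data is scanned once per operator.
unfused = (X.toarray() * Y * Z).sum()

# Fused and sparsity-exploiting: since 0 * y * z == 0, only X's nonzeros
# matter -- one scan, no materialized intermediates.
coo = X.tocoo()
fused = float(np.sum(coo.data * Y[coo.row, coo.col] * Z[coo.row, coo.col]))

assert np.isclose(unfused, fused)
```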


Cited by 43 publications (22 citation statements). References 74 publications.
“…Lara currently does not apply fusion of linear algebra operators and UDF applications, as our current dense (BLAS) and sparse (Breeze) backends do not support fused operators. Future work could extend our optimizations on data layout access patterns to generate kernels for sparse linear algebra operations with UDF support and hardware-efficient code by integrating ideas from recent work [37,12,43,16]. Furthermore, one could extend the combinator view by integrating more data representations (e.g., block-wise or compressed [24]).…”
Section: Results
confidence: 99%
“…The operator fusion over loops, presented in Section 4.2, detects independent tasks (e.g., encoding of distinct columns), but fuses them instead of executing them in parallel. SystemML also performs operator fusion [23,12] and generates linear algebra kernels based on skeleton classes. During a cost-based selection, the best plan with regard to fusion and caching for pipeline breakers is chosen.…”
Section: Related Work
confidence: 99%
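As a rough illustration of the skeleton-class idea referenced above (hypothetical names and structure, not SystemML's actual codegen API), a cell-wise template fixes the data-access loop, and only the per-cell operator chain is generated per fusion plan:

```python
from typing import Callable
import numpy as np

def cellwise_template(genexec: Callable[[float, float], float]):
    """Return a fused kernel; genexec is the generated per-cell operator chain."""
    def kernel(A: np.ndarray, B: np.ndarray) -> np.ndarray:
        out = np.empty_like(A)
        rows, cols = A.shape
        for i in range(rows):        # one scan of the inputs,
            for j in range(cols):    # no materialized intermediates
                out[i, j] = genexec(A[i, j], B[i, j])
        return out
    return kernel

# E.g., the chain (A + B) * 2 compiled into a single cell-wise kernel:
fused = cellwise_template(lambda a, b: (a + b) * 2.0)
result = fused(np.ones((2, 3)), np.ones((2, 3)))   # every cell == 4.0
```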
“…4.3.1 Overall Idea. Optimal fusion plan generation entails a large search space [8,22] and has been shown to be NP-complete [15,32]. To keep costs manageable, DNNFusion explores fusion plans by employing a new light-weight (greedy) approach based on our proposed Extended Computational Graph (ECG) IR and our classification of operations into mapping types.…”
Section: Light-weight Profile-driven Fusion Plan Exploration
confidence: 99%
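A toy sketch of the greedy, mapping-type-driven exploration described in the quote (the mapping types and the compatibility rule here are simplified assumptions, not DNNFusion's actual ECG IR or fusion tables):

```python
# Ops along one chain of the graph, classified by how they map inputs to
# outputs; adjacent ops are fused only when their mapping types are compatible.
FUSABLE = {("one-to-one", "one-to-one"), ("one-to-one", "many-to-one")}

ops = [
    ("relu", "one-to-one"),
    ("add",  "one-to-one"),
    ("sum",  "many-to-one"),
    ("conv", "many-to-many"),
]

groups, current = [], [ops[0]]
for prev, nxt in zip(ops, ops[1:]):
    if (prev[1], nxt[1]) in FUSABLE:   # greedy: keep fusing while legal
        current.append(nxt)
    else:                              # mapping types block fusion: new group
        groups.append(current)
        current = [nxt]
groups.append(current)

print([[name for name, _ in g] for g in groups])
# -> [['relu', 'add', 'sum'], ['conv']]
```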
“…VM types. Considering that there are more than one hundred VM types in today's public clouds, such as Amazon, Google, and Microsoft, we choose 120 enterprise-level VM types of x86 architecture from Amazon EC2. Note that, in Amazon EC2, there are VM Category and VM Family on top of VM type to identify the resource characteristics.…”
Section: Evaluation 5.1 Experiments Setup
confidence: 99%
“…To address this challenge, existing performance modeling efforts [21,25,29] and machine learning approaches [4,18,28] have to tolerate huge offline training overhead to build an accurate online model for each framework, since they consider only low-level metrics (such as resource utilization) within a framework. Sadly, they have to spend a lot of time training new models for similar applications on new frameworks, although recent works [3,5,10] have shown that these similar applications, in both Hadoop and Spark, involve a wide range of use cases (micro-benchmarks, machine learning, stream processing, etc.). Figure 1 shows an example of why huge offline overhead must be tolerated for a new framework.…”
Section: Introduction
confidence: 99%