ACM/IEEE SC 2002 Conference (SC'02) 2002
DOI: 10.1109/sc.2002.10015
Executing Multiple Pipelined Data Analysis Operations in the Grid

Cited by 27 publications (32 citation statements)
References 22 publications
“…If the computations of a given stage are independent from one data set to another, two consecutive computations (different data sets) for the same stage can be mapped onto distinct processors, thus reducing the period for the processing of this stage. Such a stage can be replicated, using the terminology of Subhlok and Vondran [27,28] and of the DataCutter team [6,7,26]. This corresponds to the dealable stages of Cole [11].…”
Section: Working Out An Example
confidence: 99%
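The replication idea quoted above can be made concrete with a small sketch (not from the cited papers, all names illustrative): if a stage is replicated on r processors and consecutive data sets are dealt to them round-robin, that stage's contribution to the pipeline's period drops from its per-data-set time t to t / r, and the overall period is the maximum contribution over all stages.

```python
# Sketch of pipeline period under stage replication: replicating a stage
# on r processors divides its per-data-set time by r; the pipeline period
# (time between consecutive outputs in steady state) is the max over stages.

def pipeline_period(stage_times, replication):
    """stage_times[i]: processing time of stage i for one data set.
    replication[i]: number of processors stage i is replicated on."""
    return max(t / r for t, r in zip(stage_times, replication))

# Replicating the bottleneck stage (time 6) on 3 processors moves the
# bottleneck to the last stage:
print(pipeline_period([2.0, 6.0, 3.0], [1, 3, 1]))  # → 3.0
```

This only models dealable (independent-per-data-set) stages; a stage whose computation carries state between data sets cannot be replicated this way.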
“…Such workflows operate on a collection of data sets that are executed in [26,27,31]. Each data set is input to the application graph and traverses it until its processing is complete.…”
Section: Introduction
confidence: 99%
“…Benoit and Robert [11] study the theoretical complexity of latency and throughput optimization of pipeline and fork graphs with replication and data-parallelism under the assumptions of linear clustering and round-robin processing of input data items. In [3], Spencer et al presented the Filter Copy Pipeline (FCP) scheduling algorithm for optimizing latency and throughput of data analysis application DAGs on heterogeneous resources. FCP computes the number of copies of each task that is necessary to meet the aggregate production rate of its predecessors and maps the copies to processors that yield their least completion time.…”
Section: Related Work
confidence: 99%
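The FCP description above (enough copies of each task to match the aggregate production rate of its predecessors, each copy placed on the processor that completes it earliest) can be sketched as follows. This is a hedged illustration of the idea only: the function names, the rate model, and the unit-work assumption are this sketch's own, not the paper's implementation.

```python
import math

# Illustrative sketch of the Filter Copy Pipeline (FCP) idea: replicate each
# task until its aggregate rate matches its predecessors' aggregate output
# rate, then place each copy greedily on the least-completion-time processor.

def fcp_schedule(tasks, preds, rate, proc_speed):
    """tasks: task ids in topological order.
    preds[t]: predecessors of task t (empty for source tasks).
    rate[t]: output rate of a single copy of t (data sets / sec).
    proc_speed[j]: speed of processor j; each copy is modeled as one
    unit of work, so a copy finishes at (load + 1) / speed."""
    copies = {}
    placement = {}
    load = [0.0] * len(proc_speed)   # accumulated work per processor
    for t in tasks:
        # required input rate = sum of predecessors' aggregate output rates
        in_rate = sum(copies[p] * rate[p] for p in preds[t]) or rate[t]
        copies[t] = max(1, math.ceil(in_rate / rate[t]))
        placement[t] = []
        for _ in range(copies[t]):
            # least-completion-time heuristic: pick the processor on which
            # this copy would finish earliest given current load
            j = min(range(len(proc_speed)),
                    key=lambda j: (load[j] + 1.0) / proc_speed[j])
            load[j] += 1.0
            placement[t].append(j)
    return copies, placement
```

For example, a source producing 3 data sets/sec feeding a task that consumes 1/sec yields 3 copies of the consumer, spread across the least-loaded processors.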
“…This section evaluates the performance of WMSH against previously proposed schemes: Filter Copy Pipeline (FCP) [3] and EXPERT (EXploiting Pipeline Execution undeR Time constraints) [2], and FCP-e and EXPERT-e, their modified versions. When FCP fails to utilize all processors and does not meet the throughput requirement T , FCP-e recursively calls FCP on the remaining processors until T is satisfied or all processors are used.…”
Section: Performance Analysis
confidence: 99%
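The FCP-e extension described above (recursively re-running FCP on the processors still idle until the throughput requirement T is met or all processors are used) amounts to a simple retry loop around an FCP pass. A minimal sketch, assuming an `run_fcp` callable that reports the throughput achieved and the processors it used; the additive-throughput model is this sketch's assumption:

```python
# Hedged sketch of the FCP-e retry loop: keep invoking an FCP pass on the
# leftover processors, accumulating throughput, until the target T is met
# or no processors remain.

def fcp_e(run_fcp, all_procs, T):
    """run_fcp(procs) -> (throughput, used_procs): one FCP pass, assumed
    given. FCP-e only adds the recursive retry on leftover processors."""
    throughput, used = run_fcp(all_procs)
    remaining = [p for p in all_procs if p not in used]
    while throughput < T and remaining:
        extra, used_now = run_fcp(remaining)
        throughput += extra
        remaining = [p for p in remaining if p not in used_now]
    return throughput
```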