UPCBLAS: a library for parallel matrix computations in Unified Parallel C

González-Domínguez, Jorge; Martín, María J.; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón; Mallón, Damián A.; Wibecan, Brian

doi:10.1002/cpe.1914

Cited by 8 publications

(4 citation statements)

References 25 publications

“…PGAS languages (such as UPC, Co-Array Fortran [27] or Titanium [28]) are often easier to use than their message passing counterparts [29,30] and can also obtain better performance by using efficient one-sided communication [31][32][33]. UPC++ combines these advantages of the PGAS model with object oriented programming.…”

Section: A Balanced On-demand Distribution Of the Reads Based (mentioning)

confidence: 99%

parSRA: A framework for the parallel execution of short read aligners on compute clusters

González-Domínguez, Hundt, Schmidt (2018). Journal of Computational Science

“…One or more candidate vectors will be affine to each UPC thread (see fig. 4 for illustration of the memory layout); (2) Evaluate an objective function ranking the vectors in the population in parallel; (3) while Termination criteria not satisfied do (4) Each thread T in parallel (5) for i = {1, . .…”

Section: Differential Evolution In UPC (mentioning)

confidence: 99%

“…On the other hand, wrong communication patterns in which processes unnecessarily access non-local regions of shared memory can be obtained easily due to the simplicity of the model. The PGAS is a popular model for high performance computing (HPC) [3] implemented by a number of programming languages. Most used PGAS languages include Coarray Fortran and Unified Parallel C (UPC) [2], [4], [5].…”

Section: Introduction (mentioning)

confidence: 99%

Parallel Differential Evolution in Unified Parallel C

Krömer, Platoš, Snášel (2013). 2013 IEEE Congress on Evolutionary Computation

Distributed environments and emerging highly parallel platforms provide a suitable hardware infrastructure for parallel Evolutionary Computation. The Partitioned Global Address Space model is a well-known parallel computing model used to implement scalable algorithms for many-core systems and clusters. This study investigates the Unified Parallel C programming language as a tool for implementation of scalable evolutionary algorithms for high-dimensional problems. The design concepts and initial implementation are demonstrated on the Differential Evolution algorithm. The mapping of Differential Evolution concepts to Unified Parallel C features is presented and three variants of parallel Differential Evolution for many-core shared memory systems and clusters of computers with distributed memory are implemented and evaluated in the environment of a small real-world cluster.
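The idea of giving each thread affinity to a slice of the population can be illustrated with a small sequential sketch. This is not the authors' UPC code: the DE/rand/1/bin variant, the sphere objective, the cyclic `owner` mapping, and every name and constant below are assumptions chosen for illustration, and the outer loop over threads only simulates what would run in parallel in UPC.

```python
import random

THREADS = 4          # hypothetical number of UPC-like threads
POP, DIM = 16, 5     # illustrative population size and problem dimension
F, CR = 0.5, 0.9     # assumed DE scale factor and crossover rate

def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def owner(i):
    """Assumed cyclic affinity: vector i belongs to thread i % THREADS."""
    return i % THREADS

def de_generation(pop, fit, rng):
    """One DE/rand/1/bin generation; each simulated thread updates only its own vectors."""
    new_pop, new_fit = list(pop), list(fit)
    for t in range(THREADS):  # sequential stand-in for the parallel region
        for i in (k for k in range(POP) if owner(k) == t):
            # Pick three distinct donor vectors, none equal to the target i.
            a, b, c = rng.sample([k for k in range(POP) if k != i], 3)
            j_rand = rng.randrange(DIM)  # guarantees at least one mutated gene
            trial = [
                pop[a][j] + F * (pop[b][j] - pop[c][j])
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(DIM)
            ]
            f_trial = sphere(trial)
            if f_trial <= fit[i]:  # greedy selection keeps the better vector
                new_pop[i], new_fit[i] = trial, f_trial
    return new_pop, new_fit

def run(generations=50, seed=0):
    """Return (best initial fitness, best final fitness) for a seeded run."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(DIM)] for _ in range(POP)]
    fit = [sphere(x) for x in pop]
    best0 = min(fit)
    for _ in range(generations):
        pop, fit = de_generation(pop, fit, rng)
    return best0, min(fit)
```

In this sketch the donor vectors may belong to any thread, so a real PGAS implementation would read remote data at that point; only the write to `new_pop[i]` stays local to the owning thread.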

“…Our parallel implementation overcomes the scalability issues of pMap thanks to an efficient use of UPC++ [18], an extension of C++ for parallel computing which has evolved from Unified Parallel C (UPC) [19]. PGAS (Partitioned Global Address Space) languages (such as UPC, Co-Array Fortran [20] or Titanium [21]) are often easier to use than message passing counterparts [22, 23] and can also obtain better performance than them thanks to efficient one-sided communication [24–26]. UPC++ combines these advantages of the PGAS model and object oriented programming.…”

Section: Introduction (mentioning)

confidence: 99%

Parallel and Scalable Short-Read Alignment on Multi-Core Clusters Using UPC++

2016

Self Cite

The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown to obtain high quality mappings at the expense of long execution times. Finding fast yet accurate software solutions is of high importance to research, since availability and size of NGS datasets continue to increase. In this work we present an efficient parallelization approach for NGS short-read alignment on multi-core clusters. Our approach takes advantage of a distributed shared memory programming model based on the new UPC++ language. Experimental results using the CUSHAW3 aligner show that our implementation based on dynamic scheduling obtains good scalability on multi-core clusters. Through our evaluation, we are able to complete the single-end and paired-end alignments of 246 million reads of length 150 base-pairs in 11.54 and 16.64 minutes, respectively, using 32 nodes with four AMD Opteron 6272 16-core CPUs per node. In contrast, the multi-threaded original tool needs 2.77 and 5.54 hours to perform the same alignments on the 64 cores of one node. The source code of our parallel implementation is publicly available at the CUSHAW3 homepage (http://cushaw3.sourceforge.net).
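The timings quoted in the abstract can be turned into speedups with a few lines of arithmetic. The sketch below uses only the four figures stated above; the function name and the node-level efficiency (speedup divided by the 32 nodes, taking the one-node multi-threaded run as baseline) are assumptions of this illustration rather than metrics reported by the authors.

```python
def speedup(baseline_hours, parallel_minutes):
    """Ratio of the one-node runtime to the multi-node runtime."""
    return baseline_hours * 60.0 / parallel_minutes

NODES = 32  # nodes used by the parallel runs, as stated in the abstract

# (one-node runtime in hours, 32-node runtime in minutes), from the abstract
cases = {"single-end": (2.77, 11.54), "paired-end": (5.54, 16.64)}

for name, (hours, minutes) in cases.items():
    s = speedup(hours, minutes)
    print(f"{name}: speedup {s:.1f}x, node-level efficiency {s / NODES:.2f}")
```

Under these assumptions the single-end run is roughly 14 times faster and the paired-end run roughly 20 times faster than the one-node baseline.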
