Task-based programming models have demonstrated their efficiency in the development of scientific applications on modern high-performance platforms. They delegate the management of parallelization to the runtime system (RS), which is in charge of data coherency, scheduling, and the assignment of work to the computational units. However, some applications have a limited degree of parallelism, so that no matter how efficient the RS implementation is, they may not scale on modern multicore CPUs. In this paper, we propose using speculation to unleash parallelism when it is uncertain whether some tasks will modify data, and we formalize a new methodology to enable speculative execution in a graph of tasks. This description is partially implemented in our new C++ RS called SPETABARU, which is capable of executing tasks in advance when it is not certain that other tasks will modify the data. We study the behavior of our approach when computing Monte Carlo and replica exchange Monte Carlo simulations.
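To make the idea of executing a task in advance more concrete, the following is a minimal C++ sketch of speculation around a single uncertain task. It is not the SPETABARU API: the names `maybe_write`, `consumer`, and `run_speculative` are hypothetical, and the sketch assumes that the uncertain task can report whether it actually wrote to the data. The follow-up task runs concurrently on a snapshot of the data, and its result is kept only if the uncertain task left the data untouched.

```cpp
#include <cassert>
#include <future>
#include <utility>

// Hypothetical "uncertain" task: it may or may not modify `data`.
// Returns true if it wrote to `data`.
bool maybe_write(int& data, bool accept) {
    if (accept) {
        data += 1;
        return true;
    }
    return false;
}

// Hypothetical follow-up task that only reads its input.
int consumer(int input) { return input * 10; }

// Runs maybe_write and consumer concurrently. consumer works on a snapshot
// taken before maybe_write starts. Its result is kept only if maybe_write
// did not modify the data; otherwise consumer is executed again.
// Returns {result, speculation_succeeded}.
std::pair<int, bool> run_speculative(int& data, bool accept) {
    const int snapshot = data;  // private copy for the speculative task
    auto speculative = std::async(std::launch::async, consumer, snapshot);
    const bool modified = maybe_write(data, accept);
    const int early = speculative.get();
    if (!modified) {
        return {early, true};        // speculation succeeded
    }
    return {consumer(data), false};  // speculation failed: redo on new value
}
```

When the uncertain task does not write, the two tasks overlap and the dependency costs nothing. When it does write, the speculative result is discarded and the follow-up task runs again, so the sketch never does worse than sequential execution apart from the wasted speculative work.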
Subjects: Distributed and Parallel Computing

How to cite this article: Bramas B. 2019. Increasing the degree of parallelism using speculative execution in task-based runtime systems. PeerJ Comput. Sci. 5:e183 http://doi.org/10.7717/peerj-cs.183

ALGORITHM 2: Replica Exchange Monte Carlo (parallel tempering) simulation algorithm.

    function REMC(domains[N], temperature[N])
        // Compute energy (particle to particle interactions)
        for s from 1 to N do
            energy[s] ← compute_energy(domains[s])
        end
        // Iterate for a given number of iterations
        for iter from 1 to NB_LOOPS_REMC do
            for s from 1 to N do
                // Compute usual MC for each simulation
                MC_Core(domains[s], temperature[s], energy[s])
            end
            // Compare based on a given strategy
            for s in exchange_list(iter) do
                // Use the energy difference between s and s+1 to decide to exchange them
                if random_01() ≤ metropolis(energy[s] - energy[s+1], temperature[s]) then
                    swap(domains[s], domains[s+1])
                    swap(energy[s], energy[s+1])
                end
            end
        end
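The control flow of Algorithm 2 can be sketched in sequential C++ as follows. This is only an illustration under stated assumptions, not the paper's implementation. The toy energy function (sum of squared coordinates), the acceptance rule min(1, exp(-delta / T)), the body of `mc_core` (a stand-in for MC_Core), and the even/odd pairing in `exchange_list` are all hypothetical choices made so that the example is self-contained.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Domain = std::vector<double>;  // toy domain: particle coordinates

// Toy energy (assumption): sum of squared coordinates.
double compute_energy(const Domain& d) {
    double e = 0.0;
    for (double x : d) e += x * x;
    return e;
}

// Acceptance probability (assumed form): min(1, exp(-delta / T)).
double metropolis(double delta, double temperature) {
    return delta <= 0.0 ? 1.0 : std::exp(-delta / temperature);
}

// Stand-in for MC_Core: one trial move per particle, accepted or rejected.
void mc_core(Domain& d, double temperature, double& energy, std::mt19937& rng) {
    std::uniform_real_distribution<double> u01(0.0, 1.0);
    std::uniform_real_distribution<double> step(-0.5, 0.5);
    for (std::size_t i = 0; i < d.size(); ++i) {
        const double old_x = d[i];
        d[i] += step(rng);
        const double new_energy = compute_energy(d);
        if (u01(rng) <= metropolis(new_energy - energy, temperature)) {
            energy = new_energy;  // accept the move
        } else {
            d[i] = old_x;         // reject: restore the particle
        }
    }
}

// Assumed strategy: even pairs on even iterations, odd pairs otherwise.
std::vector<std::size_t> exchange_list(std::size_t iter, std::size_t n) {
    std::vector<std::size_t> pairs;
    for (std::size_t s = iter % 2; s + 1 < n; s += 2) pairs.push_back(s);
    return pairs;
}

// Sequential version of Algorithm 2 (0-based indices). Returns the energies.
std::vector<double> remc(std::vector<Domain>& domains,
                         const std::vector<double>& temperature,
                         std::size_t nb_loops, std::mt19937& rng) {
    const std::size_t n = domains.size();
    std::uniform_real_distribution<double> u01(0.0, 1.0);
    std::vector<double> energy(n);
    for (std::size_t s = 0; s < n; ++s) energy[s] = compute_energy(domains[s]);
    for (std::size_t iter = 0; iter < nb_loops; ++iter) {
        // Usual MC for each replica
        for (std::size_t s = 0; s < n; ++s)
            mc_core(domains[s], temperature[s], energy[s], rng);
        // Exchange step between neighbouring replicas s and s+1
        for (std::size_t s : exchange_list(iter, n)) {
            if (u01(rng) <= metropolis(energy[s] - energy[s + 1], temperature[s])) {
                std::swap(domains[s], domains[s + 1]);
                std::swap(energy[s], energy[s + 1]);
            }
        }
    }
    return energy;
}
```

In this sketch, both a trial move and an exchange may leave the domains unchanged, which is the kind of uncertain write that speculative execution targets. A caller would seed a `std::mt19937`, build one domain and one temperature per replica, and call `remc` with the desired number of iterations.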