Summary
Analysis of DNA sequences is a data‐ and computation‐intensive problem and therefore requires suitable parallel computing resources and algorithms. In this paper, we describe our parallel algorithm for DNA sequence analysis, which determines how many times a pattern appears in a DNA sequence. The algorithm is engineered for heterogeneous platforms that comprise a host with multi‐core processors and one or more many‐core devices. We treat the choice of system configuration as a combinatorial optimization problem and solve it with the simulated annealing algorithm. The optimization goal is to determine the number of threads, the thread affinities, and the DNA sequence fractions for the host and the device such that the overall execution time of DNA sequence analysis is minimized. We evaluate our approach experimentally using real‐world DNA sequences of various organisms on a heterogeneous platform that comprises two Intel Xeon E5 processors and an Intel Xeon Phi 7120P co‐processing device. By running only about 5% of the possible experiments, our optimization method finds a near‐optimal system configuration for DNA sequence analysis that yields average speedups of 1.6× and 2× compared with host‐only and device‐only execution, respectively. Copyright © 2016 John Wiley & Sons, Ltd.
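To make the optimization step concrete, the following is a minimal sketch of simulated annealing over a configuration space of thread counts, thread affinities, and the host share of the sequence. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the parameter values, the cooling schedule, and the function `toy_execution_time` (a stand-in for actually timing the analysis) are all hypothetical.

```python
import math
import random

# Hypothetical search space; the values are illustrative, not taken from the paper.
HOST_THREADS = [2, 6, 12, 24, 36, 48]
DEVICE_THREADS = [30, 60, 120, 180, 240]
HOST_AFFINITY = ["none", "scatter", "compact"]
DEVICE_AFFINITY = ["balanced", "scatter", "compact"]
HOST_FRACTION = [i / 20 for i in range(21)]  # share of the sequence given to the host

SPACE = [HOST_THREADS, DEVICE_THREADS, HOST_AFFINITY, DEVICE_AFFINITY, HOST_FRACTION]


def toy_execution_time(config):
    """Stand-in cost model (an assumption); a real run would time the analysis."""
    h_threads, d_threads, h_aff, d_aff, frac = config
    h_penalty = {"none": 1.10, "scatter": 1.00, "compact": 1.05}[h_aff]
    d_penalty = {"balanced": 1.00, "scatter": 1.05, "compact": 1.15}[d_aff]
    host_time = frac * 100.0 / h_threads * h_penalty
    device_time = (1.0 - frac) * 100.0 / (0.25 * d_threads) * d_penalty
    # Host and device work concurrently, so the slower side sets the total.
    return max(host_time, device_time)


def neighbor(config, rng):
    """Change one randomly chosen parameter to a different value."""
    idx = rng.randrange(len(SPACE))
    choices = [v for v in SPACE[idx] if v != config[idx]]
    new = list(config)
    new[idx] = rng.choice(choices)
    return tuple(new)


def simulated_annealing(cost, rng, t_start=10.0, t_end=1e-3, cooling=0.95):
    """Return (best configuration, its cost, number of cost evaluations)."""
    current = tuple(rng.choice(values) for values in SPACE)
    current_cost = cost(current)
    best, best_cost = current, current_cost
    evaluations = 1
    t = t_start
    while t > t_end:
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        evaluations += 1
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost, evaluations
```

A call such as `simulated_annealing(toy_execution_time, random.Random(0))` evaluates only a small part of the space, because the number of evaluations is fixed by the cooling schedule rather than by the size of the space. In a real setting, each cost evaluation would correspond to one timed experiment on the platform.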