2013
DOI: 10.1007/978-3-642-40450-4_15

Parallel String Sample Sort

Abstract: We discuss how string sorting algorithms can be parallelized on modern multi-core shared memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we propose string sample sort. The algorithm makes effective use of the memory hierarchy, uses additional word level parallelism, and largely avoids branch mispredictions. Additionally, we parallelize variants of multikey quicksort and radix sort that are also useful in certain sit…

Cited by 18 publications (37 citation statements)
References 14 publications

“…The radix sort implementation by Rantala performs the best for most of the data sets when compared with the other state‐of‐the‐art string sorting algorithms. For the artificial data set, burst sort outperforms both the serial MKQS and the radix sort implemented by Rantala by a margin of 4.99% and 13.17%. MKQS performs very poorly in its serial version, the main reason being its inherent recursive nature.…”
Section: Experimental Results and Performance Analysis (mentioning)
confidence: 99%
“…Thus, in terms of core performance-per-power, little cores are the best with around 800 MIPS/W, big cores come second with 180 MIPS/W and Xeon cores are the worst with 40 MIPS/W. To measure memory bandwidth we use pmbw 0.6.2 (Parallel Memory Bandwidth Benchmark) [7]. Figure 1 plots the memory bandwidth of Xeon and the three ARM configurations, in log-log scale.…”
Section: Benchmark Results (mentioning)
confidence: 99%
“…Quicksort [20] can be seen as a specialization of sample sort with fixed parameter k = 2. Sample sort is very popular for sorting large amounts of items on distributed systems [30‐32], on GPUs [33], and also for strings [34,35] …”
Section: Register Sample Sort (mentioning)
confidence: 99%
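
The relationship between quicksort and sample sort mentioned in the last statement can be illustrated with a minimal sketch. This is a generic, sequential sample sort, not the string sample sort of the paper nor the register sample sort of the citing article. The function name `sample_sort` and the parameters `oversample` and `threshold` are hypothetical choices made for illustration only.

```python
import bisect
import random


def sample_sort(items, k=4, oversample=2, threshold=16, rng=None):
    """Sort a list by splitting it into k buckets around k - 1 sampled splitters.

    Hypothetical illustration: names and default values are assumptions.
    """
    rng = rng or random.Random(0)
    # Small inputs are handed to the built-in sort as the base case.
    if len(items) <= threshold or k < 2:
        return sorted(items)

    # Draw a random sample and pick k - 1 evenly spaced splitters from it.
    sample_size = min(len(items), (k - 1) * oversample)
    sample = sorted(rng.sample(items, sample_size))
    splitters = [sample[(i + 1) * len(sample) // k] for i in range(k - 1)]

    # Classify every element into one of k buckets by binary search.
    buckets = [[] for _ in range(k)]
    for x in items:
        buckets[bisect.bisect_right(splitters, x)].append(x)

    # Sort each bucket recursively and concatenate the results in order.
    out = []
    for bucket in buckets:
        if len(bucket) == len(items):
            # No progress (e.g. all keys equal): fall back to the base sort.
            out.extend(sorted(bucket))
        else:
            out.extend(sample_sort(bucket, k, oversample, threshold, rng))
    return out
```

With `k=2` there is a single splitter, which plays the role of a quicksort pivot chosen from a small sample; larger `k` splits the input into more buckets per pass.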