2015
DOI: 10.14778/2809974.2809988

SIMD- and cache-friendly algorithm for sorting an array of structures

Abstract: This paper describes our new algorithm for sorting an array of structures by efficiently exploiting the SIMD instructions and cache memory of today's processors. Recently, multiway mergesort implemented with SIMD instructions has been used as a high-performance in-memory sorting algorithm for sorting integer values. For sorting an array of structures with SIMD instructions, a frequently used approach is to first pack the key and index for each record into an integer value, sort the key-index pairs using SIMD i…
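The key-index approach mentioned in the abstract can be illustrated with a minimal sketch. It assumes 32-bit unsigned keys and fewer than 2^32 records, so that a key and its record index fit in one 64-bit integer. The record layout, the function name `sort_records_by_packed_key`, and the final gather step are assumptions made for illustration (the abstract is truncated before it says what follows the sort), and Python's built-in `list.sort` stands in for the SIMD multiway mergesort.

```python
from typing import List, Tuple

# Hypothetical record layout: (32-bit unsigned key, payload).
Record = Tuple[int, str]

INDEX_BITS = 32
INDEX_MASK = (1 << INDEX_BITS) - 1


def sort_records_by_packed_key(records: List[Record]) -> List[Record]:
    """Sort records by key via packed (key, index) integers."""
    if len(records) > INDEX_MASK + 1:
        raise ValueError("too many records for a 32-bit index")
    packed = []
    for idx, (key, _payload) in enumerate(records):
        if not 0 <= key <= INDEX_MASK:
            raise ValueError("key must fit in 32 unsigned bits")
        # Key in the high bits, index in the low bits: comparing the packed
        # integers compares keys first and breaks ties by original position.
        packed.append((key << INDEX_BITS) | idx)
    # Stand-in for a SIMD integer sort; only plain integers are compared.
    packed.sort()
    # Assumed final step: gather the records in sorted order by index.
    return [records[p & INDEX_MASK] for p in packed]
```

Because the index occupies the low bits, records with equal keys keep their original relative order in this sketch.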

Cited by 44 publications (27 citation statements)
References: 20 publications
“…Merging two sorted arrays using traditional comparison instructions is sub-optimal: The aggressive out-of-order cores are not able to predict the direction of the merge branch (i.e., which of the two arrays will give the next element). Recent projects [26,43] show how to use SIMD instructions for efficient merging. Using 128-bit instructions, we can create a bitonic merge network that merges 8 elements at a time.…”
Section: Using Libmctop In Parallel Mergesort | Citation type: mentioning
confidence: 99%
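The bitonic merge network described in the statement above can be modelled in scalar code. The sketch below assumes 32-bit elements, so that one 128-bit register would hold four values and two registers together hold the eight elements being merged. The function name `bitonic_merge_4x4` is hypothetical, and the inner loop only imitates what SIMD min/max and shuffle instructions would do in parallel; it is not taken from the cited projects.

```python
from typing import List, Sequence


def bitonic_merge_4x4(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Merge two sorted 4-element runs with a fixed compare-exchange network."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both inputs must hold exactly 4 elements")
    # An ascending run followed by a descending run is a bitonic sequence.
    v = list(a) + list(reversed(b))
    # Three stages (strides 4, 2, 1). The pairs within a stage are independent
    # and always the same, so no data-dependent branch decides what to compare.
    for stride in (4, 2, 1):
        for i in range(8):
            if i & stride == 0:
                j = i + stride
                lo, hi = min(v[i], v[j]), max(v[i], v[j])
                v[i], v[j] = lo, hi
    return v
```

The comparison pattern is fixed in advance, which is why such a network avoids the hard-to-predict merge branch that the statement mentions.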
“…SIGMOD'17, May 14-19, 2017 … trends [21,6,23,1,3,33,5,22,30,36]. The availability of low-cost memory, for instance, has given rise to the wide adoption of in-memory databases [35,26,24,8].…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…Moreover, sorting can speed up duplicate removal, ranking, and grouping operations [13]. Therefore, a lot of research has been devoted to identifying efficient sorting algorithms that utilise modern hardware features and scale well across multiple cores, processors, and even nodes [21,6,35,40,24,33,22,8]. After having recently achieved sorting rates of over one billion keys per second [28], Graphics Processing Units (GPUs), featuring thousands of cores and a memory bandwidth of several hundred gigabytes per second, emerged as a promising platform to accelerate sorting.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…Sorting is one of the most fundamental computation kernels in data management, and lots of approaches to accelerate the kernel have been proposed [1]-[8]. These approaches offer significant results, but mostly these studies utilize SIMD instructions of Intel processors [1], [7], [8] to exploit data-level parallelism or experiment on rich hardware environments such as supercomputers [5] or clusters [7].…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…These approaches offer significant results, but mostly these studies utilize SIMD instructions of Intel processors [1], [7], [8] to exploit data-level parallelism or experiment on rich hardware environments such as supercomputers [5] or clusters [7]. It is unclear that these approaches are available on low computational performance machines like embedded systems.…”
Section: Introduction | Citation type: mentioning
confidence: 99%