Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing 2020
DOI: 10.1145/3380479.3380481

How to speed Connected Component Labeling up with SIMD RLE algorithms

Abstract: The research in Connected Component Labeling, although old, is still very active and several efficient algorithms for CPUs and GPUs have emerged during the last years and are always improving the performance. This article introduces a new SIMD run-based algorithm for CCL. We show how RLE compression can be SIMDized and used to accelerate scalar run-based CCL algorithms. A benchmark done on Intel, AMD and ARM processors shows that this new algorithm outperforms the State-of-the-Art by an average factor of ×1.7 …

Citations: cited by 8 publications (14 citation statements)
References: 27 publications
“…Literature on CCL algorithms is extensive and has been centered on 2D images. CCL on CPUs has been heavily studied and optimized [14][17][6] [26]. On GPUs, after an early era of iterative algorithms [43][3] [20], a new generation introduced by Komura [23] are now direct; a new way to manage equivalences and reduce memory accesses was introduced by Playne [36] and has become the basis of the fastest CCL algorithms [19] [2].…”
Section: State-of-the-art Of 3d Algorithms (mentioning)
confidence: 99%
“…Overlapping segments between lines can also be found without ER using a Finite-State Machine (FSM). In the 2D unification [27], each state of the 2D FSM encodes segment configurations between the current and previous lines. Merging two lines involves iterating over both at the same time: a new label is created for each isolated segment, whereas the components of two overlapping segments are merged together.…”
Section: A Finite-state Machine-based Unification (mentioning)
confidence: 99%
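The line-merging step described in the statement above can be illustrated with a short sketch. It is not the FSM of [27]: it swaps in a plain two-pointer sweep over both lines with a union-find table, which gives the same outcome (a new label for each isolated segment, merged components for overlapping segments). The names `unify_line`, `find` and `union`, the half-open `(start, end)` run format and the 4-connectivity overlap test are all assumptions made for this illustration.

```python
def find(parent, x):
    """Return the root label of x, halving the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the components of a and b; the smaller root wins."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)
    return min(ra, rb)


def unify_line(prev_runs, prev_labels, cur_runs, parent):
    """Label the runs of the current line against the previous line.

    Runs are half-open (start, end) intervals sorted by start.
    Two runs overlap (4-connectivity, an assumption of this sketch)
    when prev.start < cur.end and prev.end > cur.start.
    """
    cur_labels = []
    i = 0  # first previous-line run that may still overlap
    for start, end in cur_runs:
        # Skip previous runs that end before this run begins.
        while i < len(prev_runs) and prev_runs[i][1] <= start:
            i += 1
        label = None
        j = i
        # Visit every previous run that starts before this run ends.
        while j < len(prev_runs) and prev_runs[j][0] < end:
            if label is None:
                label = find(parent, prev_labels[j])
            else:
                label = union(parent, label, prev_labels[j])
            j += 1
        if label is None:
            # Isolated segment: create a new label.
            label = len(parent)
            parent.append(label)
        cur_labels.append(label)
        # The last overlapping previous run may also touch the next run.
        i = max(i, j - 1)
    return cur_labels


parent = [0, 1]
labels = unify_line([(0, 2), (4, 6)], [0, 1], [(1, 5), (8, 9)], parent)
print(labels, parent)
```

In the usage lines, the run `(1, 5)` bridges both previous runs, so their labels are merged, while `(8, 9)` overlaps nothing and receives a fresh label.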
“…There are already CPU algorithms implementing those ideas: the LSL [29] and derivatives. We re-designed FLSL [18], a variant of LSL for SIMD CPU (SSE, AVX512, Neon), to target GPUs and address their architectural constraints. The crucial part is to first do a segment detection that consists in an RLE encoder and relies on "compress-store" (Figure 2).…”
Section: Full Runs (Flsl) (mentioning)
confidence: 99%
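As a rough illustration of the segment detection mentioned in the statement above, the sketch below RLE-encodes one binary line in two steps: build an edge mask marking where a pixel differs from its left neighbour, then keep only the indices where the mask is set. That second step is a scalar stand-in for a SIMD "compress-store"; it is not the FLSL implementation. The function name `rle_encode` and the half-open `(start, end)` run format are assumptions of this sketch.

```python
def rle_encode(line):
    """Return the half-open runs (start, end) of non-zero pixels in a line."""
    n = len(line)
    # Edge mask over n + 1 positions, with the line zero-padded on both
    # sides: True where the pixel differs from its left neighbour.
    edges = [
        bool(line[i] if i < n else 0) != bool(line[i - 1] if i > 0 else 0)
        for i in range(n + 1)
    ]
    # "Compress" step: keep only the indices whose mask bit is set.
    # A SIMD version would do this with a compress-store instruction.
    bounds = [i for i, is_edge in enumerate(edges) if is_edge]
    # Edges alternate between run starts and run ends.
    return list(zip(bounds[0::2], bounds[1::2]))


print(rle_encode([0, 1, 1, 0, 0, 1, 0, 1]))
```

Because of the zero padding, a run touching either border of the line still produces both a start and an end, so the edge list always has even length.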
“…CCL on CPUs has been heavily studied and optimized [15] [16] [17] [18]. Early GPU CCL algorithms were iterative [19] [20] [21].…”
Section: Introduction (mentioning)
confidence: 99%