Proceedings of the Conference on High Performance Graphics 2009
DOI: 10.1145/1572769.1572795

Efficient stream compaction on wide SIMD many-core architectures

Abstract: Stream compaction is a common parallel primitive used to remove unwanted elements in sparse data. This allows highly parallel algorithms to maintain performance over several processing steps and reduces overall memory usage. For wide SIMD many-core architectures, we present a novel stream compaction algorithm and explore several variations thereof. Our algorithm is designed to maximize concurrent execution, with minimal use of synchronization. Bandwidth and auxiliary storage requirements are reduced significant…

Citations: cited by 88 publications (69 citation statements)
References: 17 publications
“…It computes the offset for a selected element by using: section offset + block offset + warp offset + intra-warp offset. The intra-warp offset can be determined by using the intra-warp voting functions __ballot() and __popc() [2][5], while the warp offset can be computed by using the binary-bit scan operations [5]. It uses the function __syncthreads_count() to get the total number of wanted elements within the block and then writes the number to an intermediate block-count array.…”
Section: Prefix-sum Based Approaches
Type: mentioning
confidence: 99%
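
The offset composition described in this statement can be illustrated with a small sequential sketch. It is not the cited authors' implementation: it emulates a warp vote and a population count in plain Python, and it models only two of the four levels (warp offset plus intra-warp offset). The names `ballot`, `popc`, `compact`, and `WARP_SIZE` are hypothetical helpers chosen for this illustration; on a GPU the corresponding steps would run in parallel across lanes.

```python
WARP_SIZE = 32  # assumed warp width for this illustration


def ballot(predicates):
    """Emulate a warp vote: bit i of the result is set when lane i's predicate is true."""
    mask = 0
    for lane, keep in enumerate(predicates):
        if keep:
            mask |= 1 << lane
    return mask


def popc(x):
    """Emulate a population count: the number of set bits in x."""
    return bin(x).count("1")


def compact(values, keep, warp_size=WARP_SIZE):
    """Order-preserving compaction using warp offset + intra-warp offset."""
    warps = [values[i:i + warp_size] for i in range(0, len(values), warp_size)]
    masks = [ballot([keep(v) for v in warp]) for warp in warps]

    # Exclusive prefix sum over per-warp counts gives each warp's base offset.
    warp_offsets = []
    total = 0
    for mask in masks:
        warp_offsets.append(total)
        total += popc(mask)

    out = [None] * total
    for warp, mask, base in zip(warps, masks, warp_offsets):
        for lane, value in enumerate(warp):
            if (mask >> lane) & 1:
                # Intra-warp offset: kept elements in lanes below this one.
                intra = popc(mask & ((1 << lane) - 1))
                out[base + intra] = value
    return out


if __name__ == "__main__":
    print(compact(list(range(10)), lambda v: v % 2 == 0, warp_size=4))
```

The block and section levels mentioned in the statement would add further base offsets in the same way, each obtained from an exclusive prefix sum over the counts of the level below.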
“…A parallel prefix sum over this array will generate the list of indirections from which the parent nodes can be updated in parallel. Finally a stream compaction pass generates the new list of unique nodes [Billeter et al 2009]. …”
Section: Reducing a Sparse Voxel Tree to a DAG
Type: mentioning
confidence: 99%
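
As a rough illustration of the two steps named in this statement, the sketch below runs a prefix sum over a flag array to build an indirection list, then compacts the flagged entries into a list of unique nodes. It assumes the nodes have already been sorted so that identical ones are adjacent; that assumption, the function name `dedupe_sorted`, and the string node labels are hypothetical and do not come from the citing paper.

```python
from itertools import accumulate


def dedupe_sorted(nodes):
    """Return (indirection, unique) for a list whose equal entries are adjacent."""
    # flags[i] is 1 when nodes[i] starts a new run of identical entries.
    flags = [1 if i == 0 or nodes[i] != nodes[i - 1] else 0
             for i in range(len(nodes))]

    # Inclusive prefix sum minus one: the index of each node's representative
    # in the compacted list. Parents could remap their child pointers with it.
    indirection = [s - 1 for s in accumulate(flags)]

    # Compaction pass: scatter only the flagged (first-of-run) nodes.
    unique = [None] * (indirection[-1] + 1 if nodes else 0)
    for node, flag, dst in zip(nodes, flags, indirection):
        if flag:
            unique[dst] = node
    return indirection, unique


if __name__ == "__main__":
    print(dedupe_sorted(["a", "a", "b", "c", "c", "c"]))
```

Both the scan and the scatter are written sequentially here for readability; each is a standard data-parallel primitive.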
“…Most notably, algorithms to efficiently implement workload balancing using a compaction step were introduced in the context of KD-trees [15], Reyes-style subdivision [8] and bounding volume hierarchies construction [4]. Generalized stream compaction was presented by Billeter et al. [1]. In the context of tessellating parametric surfaces, scan operations were used in order to scatter dynamically generated vertices to a VBO [12].…”
Section: Previous Work
Type: mentioning
confidence: 99%
“…Maybe applying the shared-memory-aware compaction model presented by Billeter et al. [1] could further improve performance.…”
Section: Derivation
Type: mentioning
confidence: 99%