2022
DOI: 10.1109/tpds.2022.3154315
OSM: Off-Chip Shared Memory for GPUs

Cited by 9 publications (3 citation statements). References 64 publications.
“…In AMDGPU, registers spill data into global memory, and GPU compute units must go through the second-level cache to access global memory, which reduces access efficiency and introduces a speed mismatch. Future work will address storing register spill data in a relatively fast on-chip memory [16], such as an L1 cache or LDS memory, to improve access efficiency after GPU register spilling.…”
Section: Discussion
Confidence: 99%
“…When choosing which memory a thread will access, one must consider which memory spaces are visible to that thread. Because each thread block is confined to a single SM, each block is allocated its own L1 data and instruction caches, shared among its threads [75]. All threads within a thread block share read and write access to the L1 cache, and threads in other blocks cannot view this data.…”
Section: CUDA Developer Toolkit
Confidence: 99%
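The block-private visibility described in the statement above can be sketched in a minimal CUDA kernel. This is an illustrative sketch, not code from the cited works: it uses `__shared__` memory (the software-managed on-chip scratchpad, which on many NVIDIA GPUs shares hardware with the L1 cache) to show that data staged on-chip is visible only to threads within the same block; the kernel and variable names are hypothetical.

```cuda
#include <cstdio>

// Sketch: each thread block reverses its own 256-element tile via
// __shared__ memory, which is visible only to threads in that block.
__global__ void reverseTile(const int *in, int *out, int n) {
    __shared__ int tile[256];          // block-private on-chip buffer
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n) {
        tile[threadIdx.x] = in[gid];   // stage element through shared memory
        __syncthreads();               // barrier for threads in THIS block only
        out[gid] = tile[blockDim.x - 1 - threadIdx.x];
    }
}

int main() {
    const int n = 256;
    int h[n], *d_in, *d_out;
    for (int i = 0; i < n; ++i) h[i] = i;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h, n * sizeof(int), cudaMemcpyHostToDevice);
    reverseTile<<<1, 256>>>(d_in, d_out, n);   // one block of 256 threads
    cudaMemcpy(h, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("%d %d\n", h[0], h[n - 1]);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

A thread in a different block could not read `tile`; communicating across blocks would instead require global memory, which is exactly the visibility boundary the quoted passage describes.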
“…[21,22], the DCT-based least-squares unwrapping algorithm is highly parallel and is very suitable for execution on a GPU. Recently, researchers have introduced the concept of off-chip shared memory, which improves the efficiency of communication between processes [23]. Shared memory enables multiple processes to access the same memory space, optimizing thread communication and reducing data processing time.…”
Section: Introduction
Confidence: 99%