Huanxin Lin scite author profile

Huanxin Lin

3 Publications

3 Citation Statements Received

27 Citation Statements Given

Affiliations

Hong Kong University of Science and Technology, University of Hong Kong, Chinese University of Hong Kong

Publications

On-GPU Thread-Data Remapping for Branch Divergence Reduction

Lin, Wang, Liu

2018

ACM Trans. Archit. Code Optim.

General-purpose GPU computing (GPGPU) plays an increasingly vital role in high-performance computing and other areas such as deep learning. However, because of the SIMD execution model, branch divergence lowers the efficiency of conditional branching on GPUs and hinders the development of GPGPU. To reduce branch divergence on the spot at runtime, we propose the first on-GPU thread-data remapping scheme. Before kernel launch, our solution inserts code into GPU kernels immediately before each target branch to acquire actual runtime divergence information. GPU software threads can be remapped to datasets multiple times during a single kernel execution. We propose two thread-data remapping algorithms tailored to the GPU architecture. Effective on two generations of GPUs from both NVIDIA and AMD, our solution achieves speedups of up to 2.718 on third-party benchmarks. We also implement three frontier GPGPU benchmarks from computer vision, algorithmic trading, and data analytics. These are hindered by more complex divergence coupled with different memory access patterns, and our solution outperforms the traditional thread-data remapping scheme in all cases. As a compiler-assisted runtime solution, it can better reduce divergence for divergent applications that currently gain little acceleration on GPUs.
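
The general idea of thread-data remapping can be illustrated with a small host-side simulation. The sketch below is not the paper's on-GPU algorithm. It assumes a warp size of 32, a hypothetical branch predicate `takes_branch`, and a simple stable partition as the remapping step, and it only counts how many simulated warps contain threads that disagree on the branch outcome before and after remapping.

```python
WARP_SIZE = 32  # assumed warp width for this simulation


def takes_branch(value):
    """Hypothetical branch predicate standing in for a kernel's `if` condition."""
    return value % 2 == 0


def divergent_warps(mapping, data, warp_size=WARP_SIZE):
    """Count warps whose threads disagree on the branch outcome.

    mapping[t] is the index of the data item assigned to thread t.
    """
    count = 0
    for start in range(0, len(mapping), warp_size):
        warp = mapping[start:start + warp_size]
        outcomes = {takes_branch(data[i]) for i in warp}
        if len(outcomes) > 1:
            count += 1
    return count


def remap(mapping, data):
    """Stable partition: data items that take the branch are assigned first.

    This is an illustrative remapping, not one of the paper's two algorithms.
    """
    taken = [i for i in mapping if takes_branch(data[i])]
    not_taken = [i for i in mapping if not takes_branch(data[i])]
    return taken + not_taken


# Alternating even/odd values make every warp divergent under the identity mapping.
data = list(range(128))
identity = list(range(len(data)))
remapped = remap(identity, data)

print(divergent_warps(identity, data))  # divergent warps before remapping
print(divergent_warps(remapped, data))  # divergent warps after remapping
```

In this toy input every warp diverges under the identity mapping, and none does once threads with the same branch outcome are grouped together. A real scheme also has to weigh the cost of computing the remapping and its effect on memory access patterns, which this sketch ignores.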

Efficient low-latency packet processing using On-GPU Thread-Data Remapping

Lin, Wang

2019

Journal of Parallel and Distributed Computing

On-GPU thread-data remapping for nested branch divergence

Lin, Wang

2020

Journal of Parallel and Distributed Computing
