2015
DOI: 10.1007/s11554-015-0519-1

GPU-assisted HEVC intra decoder

Abstract: The added encoding efficiency and visual quality offered by the High Efficiency Video Coding (HEVC) standard are attained at the cost of significant computational complexity in both the encoder and the decoder. In particular, the large number of intra prediction modes that are now considered by this standard, together with the increased complexity of the adopted block coding tree structures using a larger diversity of transforms, imposes demanding computational efforts that can hardly be satisfied by cu…

Cited by 12 publications (9 citation statements)
References 19 publications
“…The more detailed parallelization strategies for the IT, MC, IP, and the in-loop filters (i.e. DBF and SAO) have been elaborated in [14], [9], [12], and [45], respectively. : Parallel decoding on the GPU with two independent frames in flight (and hence two CUDA streams), assuming that the considered GPU has enough resources to execute multiple kernels concurrently.…”
Section: Parallel Decoding On the CPU And GPU Devices
mentioning
confidence: 99%
“…As in [9], the warps of the IT GPU kernel are assigned according to the block partitioning obtained from the bitstream. The new optimizations that were herein introduced for the IT GPU kernel consist of: i) better data packing: the required data is stored in a 2-byte word per 8×8 block, which includes the block sizes, transform flags, and prediction type; and ii) inter predicted blocks: the new IT GPU kernel already supports the inverse transform of inter predicted blocks, which were not considered for the intra decoder in [9].…”
Section: B Optimization Of the Decoding Procedures For GPU Execution
mentioning
confidence: 99%
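The data packing mentioned in the statement above (one 2-byte word per 8×8 block holding block sizes, transform flags, and prediction type) can be illustrated with a small host-side sketch. The bit layout, the `BlockInfo` struct, and the `packBlockInfo`/`unpackBlockInfo` helpers are all assumptions made for illustration; the citing paper's actual field widths and ordering are not given here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of the 16-bit word (NOT taken from the cited papers):
//   bits 0-1 : log2(transform size) - 2   (4x4 .. 32x32)
//   bits 2-3 : log2(coding block size) - 3 (8x8 .. 64x64)
//   bit  4   : transform-skip flag
//   bit  5   : block carries residual data
//   bit  6   : prediction type (0 = intra, 1 = inter)
//   bits 7-15: unused in this sketch
struct BlockInfo {
    unsigned log2TransformSize;  // 2..5
    unsigned log2CodingSize;     // 3..6
    bool transformSkip;
    bool hasResidual;
    bool isInter;
};

// Pack the per-block fields into one 2-byte word.
inline std::uint16_t packBlockInfo(const BlockInfo& b) {
    return static_cast<std::uint16_t>(
        ((b.log2TransformSize - 2u) & 0x3u) |
        (((b.log2CodingSize - 3u) & 0x3u) << 2) |
        ((b.transformSkip ? 1u : 0u) << 4) |
        ((b.hasResidual ? 1u : 0u) << 5) |
        ((b.isInter ? 1u : 0u) << 6));
}

// Recover the fields from a packed word.
inline BlockInfo unpackBlockInfo(std::uint16_t w) {
    BlockInfo b;
    b.log2TransformSize = (w & 0x3u) + 2u;
    b.log2CodingSize = ((w >> 2) & 0x3u) + 3u;
    b.transformSkip = ((w >> 4) & 0x1u) != 0;
    b.hasResidual = ((w >> 5) & 0x1u) != 0;
    b.isInter = ((w >> 6) & 0x1u) != 0;
    return b;
}
```

Packing the metadata this way means a kernel can fetch everything it needs for an 8×8 block with a single 16-bit load, which is presumably the motivation behind the "better data packing" the statement refers to.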
“…If the current frame is encoded as an I frame, with only intra predicted blocks, then the IP kernel is started right after the IT kernel. As it was proposed in [9], each warp performs the intra prediction of all blocks or sub-blocks in an 8-sample row of the frame. Similarly, the thread block of this kernel consists of 8 warps, which perform a frame row with a height of 64 samples (FR×64), thus accomplishing a wavefront approach for the whole frame.…”
Section: Global Memory Host Memory
mentioning
confidence: 99%
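The work partitioning described in the statement above (8 warps per thread block, each warp covering an 8-sample row, so one thread block spans 64 sample rows) reduces to simple index arithmetic. The following host-side sketch only shows that mapping; the function and constant names are hypothetical, and the wavefront dependencies between neighbouring rows are not modelled.

```cpp
#include <cassert>

// Values stated in the quoted text.
constexpr int kWarpRowHeight = 8;  // sample rows handled by one warp
constexpr int kWarpsPerBlock = 8;  // warps in one thread block
constexpr int kBlockRowHeight = kWarpRowHeight * kWarpsPerBlock;  // 64

// Number of thread blocks needed to cover a frame of the given height
// (rounded up so a partial last stripe still gets a thread block).
inline int threadBlocksForFrame(int frameHeight) {
    return (frameHeight + kBlockRowHeight - 1) / kBlockRowHeight;
}

// First sample row processed by a given warp of a given thread block.
inline int firstSampleRow(int blockIndex, int warpIndex) {
    return blockIndex * kBlockRowHeight + warpIndex * kWarpRowHeight;
}
```

Under these assumptions, a 1080-row frame needs 17 thread blocks, and warp 3 of thread block 2 starts at sample row 152.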
“…The majority of those articles deal with encoding challenges. Some of them, like [5], exploit Graphics Processing Units (GPUs) to accelerate the intra decoding procedure in the HEVC decoder. Partial hardware implementations of H.265 in HLS are presented, e.g., in [6] and [7], dealing with only part of the standard, which may imply overall challenges in implementing the entire HEVC encoding/decoding in FPGA.…”
mentioning
confidence: 99%