2018
DOI: 10.3390/app8081235

Taxonomy of Vectorization Patterns of Programming for FIR Image Filters Using Kernel Subsampling and New One

Abstract: This study examines vectorized programming for finite impulse response image filtering. Finite impulse response image filtering occupies a fundamental place in image processing, and has several approximated acceleration algorithms. However, no sophisticated method of acceleration exists for parameter adaptive filters or any other complex filter. For this case, simple subsampling with code optimization is a unique solution. Under the current Moore’s law, increases in central processing unit frequency have stopp…

citations
Cited by 23 publications
(18 citation statements)
references
References 40 publications
“…For referring LUTs, the set or gather SIMD instructions were employed. The outermost loop was parallelized by multi-core threading, and we had pixel-loop vectorization [50]. This implementation was found to be the most effective [50].…”
Section: Results (citation type: mentioning)
confidence: 99%
“…The outermost loop was parallelized by multi-core threading, and we had pixel-loop vectorization [50]. This implementation was found to be the most effective [50]. Notably, a vectorized exponential operation is not implemented in these CPUs.…”
Section: Results (citation type: mentioning)
confidence: 99%
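
The loop structure described in the statement above (outermost loop parallelized across cores, innermost loop running over pixels, weights read from a LUT) can be sketched as follows. This is only an illustrative sketch, not the cited authors' code: the function names `make_range_lut` and `filter_pixel_loop`, the Gaussian range weight, the 8-bit grayscale layout, and the clamped borders are all assumptions made for the example. The inner loop is plain scalar C++ that a compiler may auto-vectorize; a hand-written AVX2 version would typically replace the LUT read with a gather intrinsic such as `_mm256_i32gather_ps`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: range-weight LUT indexed by the absolute
// intensity difference (0..255), so no exp() is needed in the filter loop.
std::vector<float> make_range_lut(float sigma) {
    std::vector<float> lut(256);
    for (int d = 0; d < 256; ++d)
        lut[d] = std::exp(-(d * d) / (2.0f * sigma * sigma));
    return lut;
}

// Range-weighted FIR filter on an 8-bit grayscale image (assumed layout:
// row-major, borders clamped). Loop order: rows -> kernel offsets -> pixels.
std::vector<float> filter_pixel_loop(const std::vector<std::uint8_t>& src,
                                     int width, int height, int radius,
                                     const std::vector<float>& lut) {
    assert(static_cast<int>(src.size()) == width * height);
    assert(lut.size() == 256);
    std::vector<float> dst(src.size());

    // Outermost loop: rows are independent, so this is the loop that
    // could be distributed across cores (e.g. with an OpenMP parallel for).
    for (int y = 0; y < height; ++y) {
        std::vector<float> num(width, 0.0f), den(width, 0.0f);
        for (int dy = -radius; dy <= radius; ++dy) {
            const int yy = std::min(std::max(y + dy, 0), height - 1);
            for (int dx = -radius; dx <= radius; ++dx) {
                // Innermost loop runs over pixels: each iteration is one
                // SIMD lane in a vectorized build (pixel-loop vectorization).
                for (int x = 0; x < width; ++x) {
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    const int center = src[y * width + x];
                    const int neighbor = src[yy * width + xx];
                    // Data-dependent table read: the "gather" step.
                    const float w = lut[std::abs(neighbor - center)];
                    num[x] += w * neighbor;
                    den[x] += w;
                }
            }
        }
        // den[x] >= lut[0] == 1 because the center pixel is always included.
        for (int x = 0; x < width; ++x)
            dst[y * width + x] = num[x] / den[x];
    }
    return dst;
}
```

Keeping the pixel loop innermost means every lane executes the same kernel offset, so the only irregular memory access is the LUT read; that is the access a gather instruction would serve.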
“…In addition, the search window and kernel sizes are closely related not only to image characteristics, but also to the calculation time (i.e., time resolution) [46, 47]. Figure 6 shows the time resolution results of various search window and kernel sizes.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…In [18], the Harris operator is optimized using a number of optimizations such as vectorization, data interleaving and parallelization, on both x86/x64 and Arm processors. In [19], different ways of vectorizing the 3D convolution are shown. In [20], a white paper for a Gaussian Blur implementation on Intel processors is proposed, using FP computations.…”
Section: Related Work (citation type: mentioning)
confidence: 99%