FPGA Based Acceleration for Image Processing Applications

Saldaña-González, Griselda; Arias-Estrada, Miguel

doi:10.5772/7067

Cited by 6 publications

(2 citation statements)

References 12 publications


“…Systolic arrays are usually built on spatial architectures like FPGAs and CGRAs, or built as ASIC circuits [2,6,7,14,24,25].…”

Section: Related Work (mentioning)

confidence: 99%

Systolic Computing on GPUs for Productive Performance

Rong, Hao, Liang et al. 2020

Preprint


We propose a language and compiler to productively build high-performance software systolic arrays that run on GPUs. Based on a rigorous mathematical foundation (uniform recurrence equations and space-time transform), our language has a high abstraction level and covers a wide range of applications. A programmer specifies a projection of a dataflow compute onto a linear systolic array, while leaving the detailed implementation of the projection to a compiler; the compiler implements the specified projection and maps the linear systolic array to the SIMD execution units and vector registers of GPUs. In this way, both productivity and performance are achieved at the same time. This approach neatly combines loop transformations, data shuffling, and vector register allocation into a single framework. Meanwhile, many other optimizations can be applied as well; the compiler composes the optimizations together to generate efficient code. We implemented the approach on Intel GPUs. This is the first system that allows productive construction of systolic arrays on GPUs. We allow multiple projections, arbitrary projection directions and linear schedules, which can express most, if not all, systolic arrays in practice. Experiments with 1-D and 2-D convolution on an Intel GEN9.5 GPU have demonstrated the generality of the approach, and its productivity in expressing various systolic designs for finding the best candidate. Although our systolic arrays are purely software running on generic SIMD hardware, compared with the GPU's specialized hardware samplers that perform the same convolutions, some of our best designs are up to 59% faster. Overall, this approach holds promise for productive high-performance computing on GPUs.
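To make the idea of a linear systolic array for 1-D convolution concrete, here is a minimal Python sketch. It is not the language or compiler described in the abstract, and it does not target a GPU. It simulates one classic design, chosen here as an assumption: each processing element (PE) holds one weight, the current input sample is broadcast to all PEs, and partial sums shift one PE to the right every cycle. The function names `systolic_conv1d` and `direct_conv1d` are hypothetical, and the kernel is applied without flipping (correlation form).

```python
def systolic_conv1d(x, w):
    """Simulate a linear array of len(w) PEs computing
    y[i] = sum_k w[k] * x[i + k] for every fully overlapping window."""
    k = len(w)
    acc = [0] * k          # partial-sum register at the output of each PE
    out = []
    for t, sample in enumerate(x):
        # Partial sums move one PE to the right; PE 0 starts a fresh sum.
        incoming = [0] + acc[:-1]
        # Every PE multiplies the broadcast sample by its stationary weight.
        acc = [incoming[j] + w[j] * sample for j in range(k)]
        # After k - 1 cycles of fill, the last PE emits one result per cycle.
        if t >= k - 1:
            out.append(acc[-1])
    return out


def direct_conv1d(x, w):
    """Reference implementation using the defining sum directly."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    w = [1, 0, -1]
    print(systolic_conv1d(x, w))   # matches direct_conv1d(x, w)
```

The simulation and the direct sum agree because a partial sum that enters PE 0 at cycle t picks up w[k] * x[t + k] at PE k, so it leaves the last PE holding the full window sum for position t.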


“…In this approach a matched filter by correlation is used, which also determines the object's center of mass every 4.51 ms for small images (256 × 256 pixels). In [10], a system for image filtering and motion estimation using SAD (sum of absolute differences) is implemented using a systolic architecture suitable for estimating motion every 5 ms in images with 640 × 480 pixels. The design and implementation of robust real-time visual servoing control, with an FPGA-based image coprocessor for a rotary inverted pendulum, are presented in [11].…”

Section: Related Work (mentioning)

confidence: 99%

An FPGA-Based Omnidirectional Vision Sensor for Motion Detection on Mobile Robots

Mori, Arias-García, Sánchez-Ferreira et al. 2012

International Journal of Reconfigurable Computing


This work presents the development of an integrated hardware/software sensor system for moving object detection and distance calculation, based on a background subtraction algorithm. The sensor comprises a catadioptric system composed of a camera and a convex mirror that reflects the environment to the camera from all directions, obtaining a panoramic view. The sensor is used as an omnidirectional vision system, allowing for localization and navigation tasks of mobile robots. Several image processing operations such as filtering, segmentation and morphology have been included in the processing architecture. To achieve distance measurement, an algorithm to determine the center of mass of a detected object was implemented. The overall architecture has been mapped onto a commercial low-cost FPGA device using a hardware/software co-design approach, which comprises a Nios II embedded microprocessor and specific image processing blocks implemented in hardware. The background subtraction algorithm was also used to calibrate the system, allowing for accurate results. Synthesis results show that the system can achieve a throughput of 26.6 processed frames per second, and the performance analysis shows that the overall architecture achieves a speedup factor of 13.78 in comparison with a PC-based solution running on the real-time operating system xPC Target.
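The two core operations named in this abstract, background subtraction and center-of-mass computation, can be sketched in software as follows. This is a simplified illustration under stated assumptions, not the authors' FPGA architecture: it uses a fixed reference background, a single absolute-difference threshold, and treats every foreground pixel as belonging to one object. The function name `detect_centroid` and the `threshold` parameter are hypothetical, and the filtering, morphology, and distance-calculation stages are omitted.

```python
def detect_centroid(background, frame, threshold):
    """Return the (row, col) center of mass of the pixels that differ
    from the background by more than `threshold`, or None if none do.

    `background` and `frame` are equally sized lists of rows of
    grayscale intensities.
    """
    count = 0
    row_sum = 0
    col_sum = 0
    for r, (bg_row, fr_row) in enumerate(zip(background, frame)):
        for c, (bg, px) in enumerate(zip(bg_row, fr_row)):
            # Background subtraction: mark the pixel as foreground when
            # its absolute difference exceeds the threshold.
            if abs(px - bg) > threshold:
                count += 1
                row_sum += r
                col_sum += c
    if count == 0:
        return None
    # Center of mass of the binary foreground mask.
    return (row_sum / count, col_sum / count)


if __name__ == "__main__":
    background = [[0] * 4 for _ in range(4)]
    frame = [row[:] for row in background]
    for r in (1, 2):
        for c in (2, 3):
            frame[r][c] = 200      # a 2 x 2 bright object
    print(detect_centroid(background, frame, threshold=50))
```

Accumulating a pixel count and coordinate sums in a single pass keeps the computation streaming-friendly, which is one reason this formulation is commonly chosen for hardware pipelines.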


A new FPGA-based real-time configurable system for medical image processing

Chiuchisan 2013

2013 E-Health and Bioengineering Conference (EHB)
