Digital Foveation: An Energy-Aware Machine Vision Framework

Lubana, Ekdeep Singh; Dick, Robert P.

doi:10.1109/tcad.2018.2858340

Cited by 12 publications

(19 citation statements)

References 12 publications


“…This model reduces power consumption by 30% for video capturing by optimizing camera clock frequency. Based on the power model proposed in [12], Lubana et al analyzed sensing energy and described the energy model for imaging systems [6]. This work indicated that system energy consumption depends significantly on the transferred resolutions in imaging systems, and thus they optimized energy usage by using a multi-phase capture-and-analysis approach in which low-resolution, wide-area captures are used to guide high-resolution, narrow captures, thus eliminating task-irrelevant image data capture, transfer, and analysis.…”

Section: Related Work (mentioning)

confidence: 99%
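The multi-phase capture idea in the statement above (a low-resolution, wide-area capture guiding a high-resolution, narrow one) can be illustrated with a toy sketch. This is not the cited authors' implementation: the helper names, the threshold-based region detector, and the 16x16 test scene are all hypothetical, and plain subsampling like this can miss objects that fall between sampled pixels.

```python
# Toy sketch of two-phase capture. All names and values are illustrative
# assumptions, not taken from the cited paper.

def capture_low_res(scene, factor):
    """Wide-area capture: keep every `factor`-th pixel in each dimension."""
    return [row[::factor] for row in scene[::factor]]

def locate_roi(low_res, threshold):
    """Bounding box (top, left, bottom, right) in low-res coordinates of
    pixels brighter than `threshold`, or None if nothing is found."""
    hits = [(r, c)
            for r, row in enumerate(low_res)
            for c, value in enumerate(row)
            if value > threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1

def capture_high_res(scene, box, factor):
    """Narrow capture: read full-resolution pixels only inside the box,
    after scaling the box back up to sensor coordinates."""
    top, left, bottom, right = (v * factor for v in box)
    return [row[left:right] for row in scene[top:bottom]]

def pixel_count(image):
    """Number of pixels that would have to be transferred for `image`."""
    return sum(len(row) for row in image)

# Hypothetical 16x16 scene containing one bright 4x4 object.
scene = [[0] * 16 for _ in range(16)]
for r in range(8, 12):
    for c in range(4, 8):
        scene[r][c] = 255

FACTOR = 4
low = capture_low_res(scene, FACTOR)
box = locate_roi(low, threshold=128)
high = capture_high_res(scene, box, FACTOR)

transferred = pixel_count(low) + pixel_count(high)
full = pixel_count(scene)
print(f"transferred {transferred} of {full} pixels")
```

In this toy case the two captures together move far fewer pixels than one full-resolution frame, which is the effect the statement attributes to eliminating task-irrelevant data.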

“…A typical imaging pipeline starts with an image sensor that captures and converts the incoming light into electrical signals via a 2-D sensor array, and transfers the signals in the form of data frames to an image signal processor (ISP) and an application processor for digital signal processing and computer vision tasks [6]. Prior work indicates that data transfer, digital signal processing, and computer vision tasks account for more than 90% of the total energy [7], which depends strongly on the amount of data.…”

Section: A Conventional Image Analysis Framework (mentioning)

confidence: 99%


A Reinforcement-Learning-Based Energy-Efficient Framework for Multi-Task Video Analytics Pipeline

Zhao, Dong, Wang, et al. 2022. IEEE Trans. Multimedia

Self Cite


Deep-learning-based video processing has yielded transformative results in recent years. However, the video analytics pipeline is energy-intensive due to high data rates and reliance on complex inference algorithms, which limits its adoption in energy-constrained applications. Motivated by the observation of high and variable spatial redundancy and temporal dynamics in video data streams, we design and evaluate an adaptive-resolution optimization framework to minimize the energy use of multitask video analytics pipelines. Instead of heuristically tuning the input data resolution of individual tasks, our framework utilizes deep reinforcement learning to dynamically govern the input resolution and computation of the entire video analytics pipeline. By monitoring the impact of varying resolution on the quality of high-dimensional video analytics features, hence the accuracy of video analytics results, the proposed end-to-end optimization framework learns the best non-myopic policy for dynamically controlling the resolution of input video streams to globally optimize energy efficiency. Governed by reinforcement learning, optical flow is incorporated into the framework to minimize unnecessary spatio-temporal redundancy that leads to re-computation, while preserving accuracy. The proposed framework is applied to video instance segmentation which is one of the most challenging computer vision tasks, and achieves better energy efficiency than all baseline methods of similar accuracy on the YouTube-VIS dataset.


“…Linear bottlenecks and inverted residual blocks were used as the basic structure in MobileNetV2, aiming to solve the issue of feature degradation during training. Also, as pointed out by Lubana et al, energy consumption significantly depends on the transferred resolutions in imaging systems [27]. Therefore, they proposed that energy consumption can be dramatically reduced if only the task-related information is input to deep models.…”

Section: Computation-efficient Machine Vision (mentioning)

confidence: 99%

“…(3) Energy consumption of communication interface. The energy consumption of the communication interface comm is a linear function of the number of transferred frame pixels frame [27], as follows:…”

Section: Energy Model (mentioning)

confidence: 99%
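The formula in the statement above is cut off, so the exact equation is not reproduced here. As a hedged illustration of what "a linear function of the number of transferred frame pixels" implies, the sketch below uses a generic linear form. The function name, the per-pixel coefficient, and the fixed overhead term are all hypothetical placeholders, not values from the cited work.

```python
import math

def comm_energy_joules(num_pixels, energy_per_pixel_j, fixed_overhead_j=0.0):
    """Generic linear model: energy = slope * pixels + intercept.

    Both coefficients are assumptions for illustration; a real interface
    would need measured values.
    """
    if num_pixels < 0:
        raise ValueError("num_pixels must be non-negative")
    return energy_per_pixel_j * num_pixels + fixed_overhead_j

# Hypothetical per-pixel transfer cost, chosen only for the example.
ENERGY_PER_PIXEL_J = 2e-9

full_frame = comm_energy_joules(1920 * 1080, ENERGY_PER_PIXEL_J)
quarter_frame = comm_energy_joules(960 * 540, ENERGY_PER_PIXEL_J)

# With no fixed overhead, halving each dimension cuts transfer energy
# by the same factor as the pixel count.
print(f"energy ratio full/quarter: {full_frame / quarter_frame:.2f}")
```

Under such a model, any reduction in transferred pixels translates proportionally into communication energy savings, apart from whatever fixed overhead the interface has.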

MemX

Chang, Zhao, Dong, et al. 2021. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.

Self Cite


This work presents MemX: a biologically-inspired attention-aware eyewear system developed with the goal of pursuing the long-awaited vision of a personalized visual Memex. MemX captures human visual attention on the fly, analyzes the salient visual content, and records moments of personal interest in the form of compact video snippets. Accurate attentive scene detection and analysis on resource-constrained platforms is challenging because these tasks are computation and energy intensive. We propose a new temporal visual attention network that unifies human visual attention tracking and salient visual content analysis. Attention tracking focuses computation-intensive video analysis on salient regions, while video analysis makes human attention detection and tracking more accurate. Using the YouTube-VIS dataset and 30 participants, we experimentally show that MemX significantly improves the attention tracking accuracy over the eye-tracking-alone method, while maintaining high system energy efficiency. We have also conducted 11 in-field pilot studies across a range of daily usage scenarios, which demonstrate the feasibility and potential benefits of MemX.


“…Thus, the conventional image processing pipeline of video cameras has transformed in the recent years to include some form of object, scene, and/or event analysis mechanism as well [1]. Strict real-time and minimal power consumption constraints, however, limit the number and the complexity of operations that can be included within the camera modules [2]. Thus, some pre-processing tasks, such as motion estimation, image segmentation, and trivial object detection tasks have attracted the attention of contemporary researchers [3].…”

Section: Introduction (mentioning)

confidence: 99%

High-Level Synthesis of Online K-Means Clustering Hardware for a Real-Time Image Processing Pipeline

Badawi, Bilal. 2019. J. Imaging


The growing need for smart surveillance solutions requires modern video capturing devices to be equipped with advanced features, such as object detection, scene characterization, and event detection. Image segmentation into various connected regions is a vital pre-processing step in these and other advanced computer vision algorithms. Thus, the inclusion of a hardware accelerator for this task in the conventional image processing pipeline inevitably reduces the workload for more advanced operations downstream. Moreover, design entry by using high-level synthesis tools is gaining popularity for the facilitation of system development under a rapid prototyping paradigm. To address these design requirements, we have developed a hardware accelerator for image segmentation, based on an online K-Means algorithm using a Simulink high-level synthesis tool. The developed hardware uses a standard pixel streaming protocol, and it can be readily inserted into any image processing pipeline as an Intellectual Property (IP) core on a Field Programmable Gate Array (FPGA). Furthermore, the proposed design reduces the hardware complexity of the conventional architectures by employing a weighted instead of a moving average to update the clusters. Experimental evidence has also been provided to demonstrate that the proposed weighted average-based approach yields better results than the conventional moving average on test video sequences. The synthesized hardware has been tested in a real-time environment to process Full HD video at 26.5 fps, while the estimated dynamic power consumption is less than 90 mW on the Xilinx Zynq-7000 SOC.
