A Compact VLSI System for Bio-Inspired Visual Motion Estimation

Shi, Cong; Luo, Gang

doi:10.1109/tcsvt.2016.2630848

Cited by 10 publications

(28 citation statements)

References 47 publications

“…The virtual DVS sensor on the FPGA chip was essentially a 16 KB AER data buffer memory and it sent out stored AER event streams in a continuous manner, as a real DVS camera does. With such a virtualized sensor technique, the processing performance of the prototype in real application situations can be fairly measured, as if it was directly interfacing with a real sensor [36, 37]. However, the virtual DVS and other components, such as the ARM core, the PC, and the Ethernet interface in Figure 6, were only used to build the laboratory evaluation environment.…”

Section: Results (mentioning)

confidence: 99%

A High-Speed Low-Cost VLSI System Capable of On-Chip Online Learning for Dynamic Vision Sensor Data Classification

Huang, Wang, et al. 2020, Sensors

Self Cite

This paper proposes a high-speed low-cost VLSI system capable of on-chip online learning for classifying address-event representation (AER) streams from dynamic vision sensor (DVS) retina chips. The proposed system executes a lightweight statistic algorithm based on simple binary features extracted from AER streams and a Random Ferns classifier to classify these features. The proposed system’s multi-level pipelines and parallel processing circuits achieve a high throughput of up to 1 spike event per clock cycle for AER data processing. Thanks to the nature of the lightweight algorithm, our hardware system is realized in a low-cost memory-centric paradigm. In addition, the system is capable of on-chip online learning to flexibly adapt to different in-situ application scenarios. The extra overheads for on-chip learning in terms of time and resource consumption are quite low, as the training procedure of the Random Ferns is quite simple, requiring few auxiliary learning circuits. An FPGA prototype of the proposed VLSI system was implemented with 9.5~96.7% memory consumption and <11% computational and logic resources on a Xilinx Zynq-7045 chip platform. It ran at a clock frequency of 100 MHz and achieved a peak processing throughput of up to 100 Meps (Mega events per second), with an estimated power consumption of 690 mW, leading to a high energy efficiency of 145 Meps/W or 145 events/μJ. We tested the prototype system on MNIST-DVS, Poker-DVS, and Posture-DVS datasets, and obtained classification accuracies of 77.9%, 99.4% and 99.3%, respectively. Compared to prior works, our VLSI system achieves higher processing speeds, higher computing efficiency, comparable accuracy, and lower resource costs.
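The throughput and efficiency figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only re-derives the quoted numbers (100 MHz clock, 100 Meps peak, 690 mW estimated power); the variable and function names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract above.
clock_hz = 100e6               # 100 MHz clock
peak_throughput_meps = 100.0   # peak throughput, Mega events per second
power_w = 0.690                # estimated power, 690 mW


def energy_efficiency_meps_per_w(throughput_meps, power_watts):
    """Throughput divided by power, in Meps/W."""
    return throughput_meps / power_watts


eff = energy_efficiency_meps_per_w(peak_throughput_meps, power_w)

# 1 Meps/W = 1e6 events per joule = 1 event per microjoule,
# so the two units quoted in the abstract carry the same number.
events_per_joule = eff * 1e6
events_per_uj = events_per_joule / 1e6

# Peak throughput relative to the clock: events handled per cycle.
events_per_cycle = peak_throughput_meps * 1e6 / clock_hz

print(f"{eff:.1f} Meps/W, {events_per_uj:.1f} events/uJ, "
      f"{events_per_cycle:.0f} event per cycle")
```

The result rounds to the 145 Meps/W stated in the abstract, and the 100 Meps peak at 100 MHz matches the claim of one event per clock cycle.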

“…Spatiotemporally white noise was added to the sequences before the vision condition filtering to simulate external (physical world) noise. Finally, we applied Shi & Luo's (Shi & Luo, 2018) implementation of Grzywacz and Yuille’s motion energy model (see Figure 1) to estimate the speed of motion in these sequences from their spatiotemporal frequency components called motion energy. We examined the relationship between spatial frequencies and speed estimation accuracy of the computational model under different simulated vision conditions and at different speeds.…”

Section: Methods (mentioning)

confidence: 99%

“…We implemented the widely accepted computational motion perception model (Adelson & Bergen, 1985; Grzywacz & Yuille, 1990) with the following modifications (see Figure 1). (1) The 2D spatial filters were decomposed to faster 2-stage cascaded 1D filtering (Etienne-Cummings, Van der Spiegel, & Mueller, 1999; Shi & Luo, 2018). (2) In the pre-processing stage, we used a DoG filter to embrace a wider spatial frequency band from 0.5 to 36 cpd to facilitate successive processing.…”

Section: Methods (mentioning)

confidence: 99%

Without low spatial frequencies, high resolution vision would be detrimental to motion perception

2020

Self Cite

A normally sighted person can see a grating of 30 cycles per degree or higher, but spatial frequencies needed for motion perception are much lower than that. It is unknown for natural images with a wide spectrum how all the visible spatial frequencies contribute to motion speed perception. In this work, we studied the effect of spatial frequency content on motion speed estimation for sequences of natural and stochastic pixel images by simulating different visual conditions, including normal vision, low vision (low-pass filtering), and complementary vision (high-pass filtering at the same cutoff frequencies of the corresponding low-vision conditions) conditions. Speed was computed using a biological motion energy-based computational model. In natural sequences, there was no difference in speed estimation error between normal vision and low vision conditions, but it was significantly higher for complementary vision conditions (containing only high-frequency components) at higher speeds. In stochastic sequences that had a flat frequency distribution, the error in normal vision condition was significantly larger compared with low vision conditions at high speeds. On the contrary, such a detrimental effect on speed estimation accuracy was not found for low spatial frequencies. The simulation results were consistent with the motion direction detection task performed by human observers viewing stochastic sequences. Together, these results (i) reiterate the importance of low frequencies in motion perception, and (ii) indicate that high frequencies may be detrimental for speed estimation when low frequency content is weak or not present.
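The Methods excerpt for this paper mentions decomposing 2D spatial filters into 2-stage cascaded 1D filtering. A minimal sketch of why that works, assuming the 2D kernel is separable (the outer product of two 1D kernels): filtering rows and then columns gives the same result as one 2D pass. The kernel values and the test image below are illustrative only and are not taken from the cited papers.

```python
def filter_rows(img, k):
    """Slide 1D kernel k along every row (valid region, no kernel flip)."""
    n = len(k)
    return [[sum(row[j + m] * k[m] for m in range(n))
             for j in range(len(row) - n + 1)]
            for row in img]


def filter_cols(img, k):
    """Slide 1D kernel k along every column (valid region, no kernel flip)."""
    n = len(k)
    h, w = len(img), len(img[0])
    return [[sum(img[i + m][j] * k[m] for m in range(n))
             for j in range(w)]
            for i in range(h - n + 1)]


def filter_2d(img, k2):
    """Direct 2D filtering with kernel k2 (valid region, no kernel flip)."""
    kh, kw = len(k2), len(k2[0])
    h, w = len(img), len(img[0])
    return [[sum(img[i + a][j + b] * k2[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]


# Illustrative separable kernel: smoothing along y, differencing along x.
ky = [1, 2, 1]
kx = [-1, 0, 1]
k2 = [[a * b for b in kx] for a in ky]  # outer product ky x kx

# Small integer test image so the comparison is exact.
img = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(5)]

cascaded = filter_cols(filter_rows(img, kx), ky)
direct = filter_2d(img, k2)
print(cascaded == direct)
```

For an N x N kernel the cascaded form needs 2N multiply-accumulates per output pixel instead of N^2, which is where the speed-up comes from.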

“…While traditionally this would have required $N_S^2 N_T$ times of multiply-and-accumulation (MAC) operations per pixel per frame ($N_S$ and $N_T$ are filter sizes along space and time dimensions, respectively), a separable implementation can be much more efficient. Given the fact that the horizontal or vertical components of the optical flow can be computed independently from the separate horizontal or vertical motion energy channels, the 3D spatiotemporal filter can be decomposed into cascaded spatial and temporal filters [14,25]. This way, the horizontal and vertical motion energy feature maps $ME_X$ and $ME_Y$ for different spatiotemporal tuning frequencies $(f_{X/Y}, f_T)$ are extracted as:
$I_T(x,y,t;f_T) = I(x,y,t) * \mathrm{Gabor}(t;f_T)$,
$ME_X(x,y,t;f_S,f_T) = \left| I_T(x,y,t;f_T) * \mathrm{Gauss}(y) * \mathrm{Gabor}(x;f_X) \right|^2$,
$ME_Y(x,y,t;f_S,f_T) = | I_T(x,y,t;f_T) * \mathrm{Gauss}(x) * \mathrm{G}$…”

Section: Proposed TTC Estimation Algorithm (mentioning)

confidence: 99%
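To make the efficiency argument in the excerpt above concrete, the sketch below compares MAC counts per pixel per frame for a full 3D filter and for a cascade of one temporal and two spatial 1D filters. It rests on simplifying assumptions that are not stated in the source: each filter tap counts as one MAC (complex-valued Gabor taps are not counted separately), only a single motion energy channel is considered, and the filter sizes are hypothetical.

```python
def macs_full_3d(n_s, n_t):
    """MACs per pixel per frame for a dense N_S x N_S x N_T filter."""
    return n_s * n_s * n_t


def macs_separable(n_s, n_t):
    """MACs per pixel per frame for the cascaded form (assumed breakdown):
    temporal Gabor (n_t) + 1D Gauss on one axis (n_s) + 1D Gabor on the other (n_s).
    """
    return n_t + 2 * n_s


# Hypothetical filter sizes, chosen only for illustration.
n_s, n_t = 9, 7

full = macs_full_3d(n_s, n_t)
cascaded = macs_separable(n_s, n_t)
print(full, cascaded, round(full / cascaded, 1))
```

With these sizes the dense filter costs 567 MACs against 25 for the cascade. Sharing the temporal stage across channels would lower the per-channel cost further, which this count does not model.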

A Hardware-Friendly Optical Flow-Based Time-to-Collision Estimation Algorithm

Shi, Dong, Pundlik, et al. 2019, Sensors

Self Cite

This work proposes a hardware-friendly, dense optical flow-based Time-to-Collision (TTC) estimation algorithm intended to be deployed on smart video sensors for collision avoidance. The algorithm optimized for hardware first extracts biological visual motion features (motion energies), and then utilizes a Random Forests regressor to predict robust and dense optical flow. Finally, TTC is reliably estimated from the divergence of the optical flow field. This algorithm involves only feed-forward data flows with simple pixel-level operations, and hence has inherent parallelism for hardware acceleration. The algorithm offers good scalability, allowing for flexible tradeoffs among estimation accuracy, processing speed and hardware resource. Experimental evaluation shows that the accuracy of the optical flow estimation is improved due to the use of Random Forests compared to existing voting-based approaches. Furthermore, results show that estimated TTC values by the algorithm closely follow the ground truth. The specifics of the hardware design to implement the algorithm on a real-time embedded system are laid out.
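The abstract states that TTC is estimated from the divergence of the optical flow field. A minimal sketch of that relation follows, under a textbook assumption that is not taken from the paper: pure approach toward a fronto-parallel surface, so the flow is (x/τ, y/τ) about the image center and its divergence is 2/τ. The function names and the synthetic flow field are hypothetical, and the paper's own estimator may differ.

```python
def divergence(u, v):
    """Central-difference divergence du/dx + dv/dy at interior pixels."""
    h, w = len(u), len(u[0])
    return [[(u[i][j + 1] - u[i][j - 1]) / 2.0
             + (v[i + 1][j] - v[i - 1][j]) / 2.0
             for j in range(1, w - 1)]
            for i in range(1, h - 1)]


def ttc_from_flow(u, v):
    """TTC in frames from mean divergence, assuming div = 2 / TTC."""
    vals = [d for row in divergence(u, v) for d in row]
    mean_div = sum(vals) / len(vals)
    if mean_div <= 0:
        # No expansion detected: no approaching surface under this model.
        return float("inf")
    return 2.0 / mean_div


# Synthetic expanding flow (pixels per frame) with a known TTC of 20 frames.
size, center, true_ttc = 9, 4, 20.0
u = [[(j - center) / true_ttc for j in range(size)] for i in range(size)]
v = [[(i - center) / true_ttc for j in range(size)] for i in range(size)]

print(ttc_from_flow(u, v))
```

Averaging the divergence over the field before inverting is one simple way to reduce sensitivity to per-pixel flow noise; on the noise-free synthetic field it recovers the 20-frame value up to floating-point error.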
