EBBIOT: A Low-complexity Tracking Algorithm for Surveillance in IoVT using Stationary Neuromorphic Vision Sensors

Acharya, Jyotibdha; Caycedo, Andres Ussa; Padala, Vandana Reddy; Sidhu, Rishi Raj Singh; Orchard, Garrick; Ramesh, Bharath; Basu, Arindam

doi:10.1109/socc46988.2019.1570553690

Cited by 25 publications

(28 citation statements)

References 12 publications


“…For instance, in [21], the authors demonstrated effective tracking using a correlation filter on top of a CNN structure. In the work of EBBIOT [1], a vehicle tracking system based on event sensors was demonstrated. Using an adaptive time surface formulation of events, Chen et al [8] have demonstrated multiple object tracking in a controlled environment.…”

Section: Multi-object Tracking Using DVS (mentioning)

confidence: 99%

Remot

Gao

Wang

2022

Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays


In contrast to conventional vision sensors that produce images of the entire field-of-view at a fixed frame rate, dynamic vision sensors (DVS) are neuromorphic devices that only produce sparse events in response to changes in light intensity local to each pixel, making them promising technologies for use in demanding edge scenarios where energy-efficient intelligent computations are needed. While several early studies have demonstrated promising results in performing high-level machine vision tasks using vision events only, these algorithms are often too complex for real-time deployment in edge systems with limited processing and storage capabilities. In this work, a novel hardware-software architecture, called REMOT, is proposed to leverage the unique properties of DVS to perform real-time multi-object tracking (MOT) on FPGAs. REMOT incorporates a parallel set of reconfigurable hardware attention units (AUs) that work in tandem with a modular attention-guided software framework running in the attached processor. Each hardware AU autonomously adjusts its region of attention by processing each vision event as it is produced by the DVS. Using information aggregated by the AUs, high-level analyses are performed in software. To demonstrate the flexibility and modularity of REMOT, a family of MOT algorithms with different hardware-software configurations and tradeoffs has been implemented on 2 different edge reconfigurable systems. Experimental results show that REMOT is capable of processing 0.43-2.22 million events per second at 1.75-5.68 watts, making it suitable for real-time operation while maintaining good MOT accuracy in our target datasets. When compared with a software-only implementation using the same edge platforms, our HW-SW implementation results in up to 33.6 times higher event processing throughput and 25.9 times higher power efficiency.
CCS Concepts: Computer systems organization → Real-time system architecture; Reconfigurable computing. Computing methodologies → Tracking.


“…To evaluate the robustness of the proposed NOMF across temperature, we have performed 8000 point Monte Carlo (MC) simulations of a 3 × 3 image patch initialized at five random discrete 5 "1"s, 4 "0"s patterns (inset of Fig. 9(a)) chosen randomly out of $\binom{9}{4}$ possible patterns. Fig.…”

Section: E. Effect of Temperature Variations (mentioning)

confidence: 99%

“…A comparison of the proposed in-memory computing-based NOMF with other event and frame based denoising techniques are shown in Table I for processing a W × H image. The event-based nearest neighbour filter (NN-filt) [32] stores the timestamp of an incoming event using $\beta_t$ ($\beta_t = 16$) bit per timestamp [9]. Further, it marks the event as valid if the difference of timestamps in an n × n spatial neighbourhood is less than a specified threshold.…”

Section: F. Performance (mentioning)

confidence: 99%
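The nearest-neighbour filter described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the implementation from [32]: the function name `nn_filter`, the `(x, y, t)` event layout, the microsecond time unit, and the default threshold are all assumptions, the event's own pixel is excluded from the neighbourhood (one common variant, also an assumption), and the 16-bit timestamp storage is not modelled.

```python
import numpy as np

def nn_filter(events, width, height, n=3, threshold_us=5000):
    """Mark each event valid if some neighbour fired recently.

    events: iterable of (x, y, t), t in microseconds, sorted by t (assumed layout).
    Returns a list of booleans, one per event.
    """
    # Last timestamp seen at every pixel; -inf means "never fired".
    last_ts = np.full((height, width), -np.inf)
    r = n // 2
    keep = []
    for x, y, t in events:
        y0, y1 = max(0, y - r), min(height, y + r + 1)
        x0, x1 = max(0, x - r), min(width, x + r + 1)
        patch = last_ts[y0:y1, x0:x1].copy()
        # Assumption: the pixel's own history does not count as support.
        patch[y - y0, x - x0] = -np.inf
        # Valid if the most recent neighbouring event is within the threshold.
        keep.append(bool((t - patch.max()) < threshold_us))
        last_ts[y, x] = t
    return keep
```

Note that the filter keeps one timestamp per pixel, which is the memory cost the quoted comparison refers to.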

“…However, in real scenarios, many spurious events are generated due to noise which requires the use of noise filtering or denoise operations [7], [8]. While event-driven filtering [7], [8] was proposed initially, hybrid frame-event approach [9] is more suited for IoT operations. Since the event generated in the NVS will not reset until it is readout, they propose to sample the sensor memory in a burst after a regular interval $t_f$, to create an event-based binary image (EBBI) frame.…”

Section: Introduction (mentioning)

confidence: 99%
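As a rough software analogue of the EBBI idea quoted above, the sketch below groups a timestamped event list into binary frames, one per interval of length `t_f`. It assumes events are available as `(x, y, t)` tuples with integer timestamps and ignores polarity; on the actual sensor the frame is obtained by reading out the pixel memory in a burst, so this only emulates the result. The function name `ebbi_frames` is hypothetical.

```python
import numpy as np

def ebbi_frames(events, width, height, t_f):
    """Build binary frames from (x, y, t) events, one frame per t_f interval.

    A pixel is 1 if it produced at least one event in the interval.
    """
    events = np.asarray(events, dtype=np.int64).reshape(-1, 3)
    if events.size == 0:
        return np.zeros((0, height, width), dtype=np.uint8)
    # Index of the interval each event falls into.
    idx = events[:, 2] // t_f
    frames = np.zeros((int(idx.max()) + 1, height, width), dtype=np.uint8)
    # Repeated events at the same pixel simply set the same bit again.
    frames[idx, events[:, 1], events[:, 0]] = 1
    return frames
```

Because every pixel stores a single bit per frame, repeated events within one interval add no extra memory, which is part of why the hybrid approach suits constrained devices.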


A 51.3 TOPS/W, 134.4 GOPS In-memory Binary Image Filtering in 65nm CMOS

Bose

Singla

Basu

2021

Preprint

Self Cite


Neuromorphic vision sensors (NVS) can enable energy savings due to their event-driven nature, which exploits the temporal redundancy in video streams from a stationary camera. However, noise-driven events lead to false triggering of the object recognition processor. Image denoising operations require memory-intensive processing, leading to a bottleneck in energy and latency. In this paper, we present in-memory filtering (IMF), a 6T-SRAM in-memory computing based image denoising scheme for event-based binary image (EBBI) frames from an NVS. We propose a non-overlap median filter (NOMF) for image denoising. An in-memory computing framework enables hardware implementation of NOMF leveraging the inherent read disturb phenomenon of 6T-SRAM. To demonstrate the energy savings and effectiveness of the algorithm, we fabricated the proposed architecture in a 65nm CMOS process. As compared to a fully digital implementation, IMF enables > 70× energy savings and a > 3× improvement in processing time when tested with video recordings from a DAVIS sensor, and achieves a peak throughput of 134.4 GOPS. Furthermore, the peak energy efficiency of the NOMF is 51.3 TOPS/W, comparable with state-of-the-art in-memory processors. We also show that the images obtained by NOMF provide comparable accuracy in tracking and classification applications when compared with images obtained by conventional median filtering.
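The abstract does not spell out how the non-overlap median filter operates, so the following is only one plausible software reading, stated as an assumption: the binary frame is tiled into p × p blocks with stride p, and each block is reduced to its median, which for binary pixels and odd p is a majority vote. The function name `nomf`, the cropping of ragged borders, and the one-output-per-block layout are illustrative choices, not details confirmed by the source.

```python
import numpy as np

def nomf(binary_img, p=3):
    """Median over non-overlapping p x p blocks of a binary image (p odd).

    Returns one value per block, shape (H // p, W // p).
    """
    h, w = binary_img.shape
    # Assumption: rows and columns that do not fill a whole block are dropped.
    h, w = h - h % p, w - w % p
    blocks = binary_img[:h, :w].reshape(h // p, p, w // p, p)
    ones = blocks.sum(axis=(1, 3))
    # For binary data the median is 1 exactly when ones are the majority.
    return (ones > (p * p) // 2).astype(np.uint8)
```

Under this reading each pixel is visited once rather than p × p times as in a sliding-window median, which is consistent with the reduced processing cost the abstract reports; if a full-resolution output is needed, the block values can be repeated with `np.repeat` along both axes.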


“…The most straight-forward method is to accumulate events over a fixed time interval. For instance, [3,16,31] used contrived time intervals of 5-66 ms to estimate binary event images for object tracking and stereo vision. Nevertheless, the motion-dependent sensing aspect of the DVS is a major hindrance for choosing the optimal time interval to prevent motion blur or a low density event image.…”

Section: Related Work (mentioning)

confidence: 99%

Superevents: Towards Native Semantic Segmentation for Event-based Cameras

Low

Sonthalia

Gao

et al. 2021

Preprint

Self Cite


Most successful computer vision models transform low-level features, such as Gabor filter responses, into richer representations of intermediate or mid-level complexity for downstream visual tasks. These mid-level representations have not been explored for event cameras, although they are especially relevant to the visually sparse and often disjoint spatial information in the event stream. By making use of locally consistent intermediate representations, termed superevents, numerous visual tasks, ranging from semantic segmentation and visual tracking to depth estimation, stand to benefit. In essence, superevents are perceptually consistent local units that delineate parts of an object in a scene. Inspired by recent deep learning architectures, we present a novel method that employs lifetime augmentation to obtain an event stream representation that is fed to a fully convolutional network to extract superevents. Our qualitative and quantitative experimental results on several sequences of a benchmark dataset highlight the significant potential for event-based downstream applications.
CCS Concepts: Computing methodologies → Structured outputs; Image segmentation.

