Visual information varying over time is typically captured by cameras that acquire data as images (frames) equally spaced in time. By contrast, Neuromorphic Vision Sensors (NVSs) are emerging visual capture devices that acquire information only when changes occur in the scene. This results in major advantages in terms of low power consumption, wide dynamic range, and high temporal resolution. Although the acquisition strategy already results in much lower data rates than conventional video, such data can be compressed further. To this end, in this paper we propose a lossless compression strategy based on point cloud compression, inspired by the observation that, when NVS events are appropriately arranged in an (x, y, t) three-dimensional space, they form a point cloud representation of the NVS data. The proposed strategy outperforms the benchmark strategies, achieving a compression ratio up to 30% higher on the considered dataset.
INDEX TERMS: Neuromorphic vision sensor (NVS), neuromorphic spike events, point cloud compression, geometric point cloud compression (GPCC), silicon retinas, spike encoding, data compression.
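To make the (x, y, t) representation concrete, the following minimal Python sketch (illustrative only, not the authors' implementation; the event record layout, the time_scale quantization factor, and the helper name events_to_point_cloud are assumptions) shows how a stream of NVS events could be arranged as a 3-D point cloud, with polarity kept as a per-point attribute, ready to be handed to a geometry-based point cloud codec:

import numpy as np

# Hypothetical event records: (x, y, t in seconds, polarity); values are illustrative.
events = np.array([
    (12, 40, 0.000105, 1),
    (13, 40, 0.000180, 0),
    (12, 41, 0.000260, 1),
], dtype=[("x", np.uint16), ("y", np.uint16), ("t", np.float64), ("p", np.uint8)])

def events_to_point_cloud(ev, time_scale=1e6):
    """Map each event to a 3-D point (x, y, t) so the stream can be treated
    as a point cloud geometry. time_scale quantizes the time axis
    (here seconds -> microseconds) so all coordinates become integers,
    as required for a lossless geometry representation."""
    xs = ev["x"].astype(np.int64)
    ys = ev["y"].astype(np.int64)
    ts = np.round(ev["t"] * time_scale).astype(np.int64)
    points = np.stack([xs, ys, ts], axis=1)   # one (x, y, t) point per event
    attributes = ev["p"].astype(np.uint8)     # polarity kept as a per-point attribute
    return points, attributes

points, polarity = events_to_point_cloud(events)
print(points)   # e.g. [[12 40 105] [13 40 180] [12 41 260]]

The geometry array and the polarity attribute would then be fed to a geometry-based point cloud encoder; the choice of time quantization step is an assumption here and would in practice be dictated by the sensor's timestamp resolution.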
The rapid development of virtual reality applications continues to demand better compression of 360° videos owing to their large data volume. These videos are typically converted to 2-D formats using various projection techniques in order to benefit from the coding tools designed for conventional 2-D video compression. Although the recently emerged video coding standard, Versatile Video Coding (VVC), introduces 360°-video-specific coding tools, it fails to prioritize the user-observed regions in 360° videos, represented by the rectilinear images called viewports. This leads to the encoding of redundant regions in the video frames, escalating the bitrate cost of the videos. In response to this issue, this paper proposes a novel 360° video coding framework for VVC which exploits user-observed viewport information to alleviate pixel redundancy in 360° videos. In this regard, bidirectional optical flow, Gaussian filtering, and Spherical Convolutional Neural Networks (Spherical CNNs) are deployed to extract perceptual features and predict user-observed viewports. By appropriately fusing the predicted viewports onto the 2-D projected 360° video frames, a novel Regions-of-Interest (ROI) aware weight map is developed, which can be used to mask the source video and introduce adaptive changes to the Lagrange and quantization parameters in VVC. Comprehensive experiments conducted in the context of the VVC Test Model (VTM) 7.0 show that the proposed framework improves bitrate reduction, achieving an average bitrate saving of 5.85% and up to 17.15% at the same perceptual quality, measured using Viewport Peak Signal-to-Noise Ratio (VPSNR).
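As an illustration of how a viewport-driven weight map could modulate quantization, the following Python sketch (a hypothetical approximation, not the paper's framework or VTM code; the functions roi_weight_map and ctu_qp_offsets, the Gaussian width sigma, the CTU size, and the QP-offset range are all assumptions) builds a Gaussian ROI weight map on an equirectangular frame and maps it to per-CTU QP offsets, so that regions outside the predicted viewports are quantized more coarsely:

import numpy as np

def roi_weight_map(frame_h, frame_w, viewport_centers, sigma=0.15):
    """Build a normalized ROI weight map for an equirectangular frame by
    placing a 2-D Gaussian at each predicted viewport centre (u, v in [0, 1])."""
    ys, xs = np.mgrid[0:frame_h, 0:frame_w]
    w = np.zeros((frame_h, frame_w), dtype=np.float64)
    for (u, v) in viewport_centers:
        cx, cy = u * frame_w, v * frame_h
        d2 = ((xs - cx) / frame_w) ** 2 + ((ys - cy) / frame_h) ** 2
        w += np.exp(-d2 / (2.0 * sigma ** 2))
    return w / w.max()

def ctu_qp_offsets(weight_map, base_qp=32, max_offset=6, ctu=128):
    """Map the weight map to per-CTU QPs: high-weight (viewport) CTUs keep the
    base QP, low-weight CTUs receive coarser quantization."""
    h, w = weight_map.shape
    qp = np.empty(((h + ctu - 1) // ctu, (w + ctu - 1) // ctu), dtype=np.int32)
    for i in range(qp.shape[0]):
        for j in range(qp.shape[1]):
            block = weight_map[i * ctu:(i + 1) * ctu, j * ctu:(j + 1) * ctu]
            qp[i, j] = base_qp + int(round(max_offset * (1.0 - float(block.mean()))))
    return qp

weights = roi_weight_map(1920, 3840, viewport_centers=[(0.5, 0.5)])
print(ctu_qp_offsets(weights))   # per-CTU QP grid, lower near the viewport centre

An analogous scaling of the Lagrange multiplier per CTU would follow the same weight map; the exact fusion of multiple predicted viewports and the mapping from weights to QP and Lagrange changes used in the paper are not reproduced here.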