Video snapshot compressive imaging (SCI) is a promising technique for capturing high-speed videos: it shifts the burden of imaging speed from the detector to mask modulation, and only a single measurement is needed to capture multiple frames. The algorithm that reconstructs the high-speed frames from the measurement plays a vital role in SCI. In this paper, we consider a promising reconstruction framework, namely plug-and-play (PnP), which is more flexible with respect to the encoding process than other deep learning networks. One drawback of existing PnP algorithms is that they plug in a pre-trained denoising network as the prior, whereas the data used to train that network may differ from the data encountered in real applications. To this end, we propose an online PnP algorithm that adaptively updates the network's parameters within the PnP iterations; this makes the denoising network better suited to the desired data in SCI reconstruction. Furthermore, for color video imaging, RGB frames need to be recovered from the Bayer-pattern measurement, a step known as demosaicing in the camera pipeline. To address this challenge, we design a two-stage reconstruction framework that optimizes these two coupled ill-posed problems and introduce a deep demosaicing prior tailored to video demosaicing. Extensive results on both simulation and real datasets verify the superiority of our adaptive deep PnP algorithm. The code to reproduce the results is at https://github.com/xyvirtualgroup/AdaptivePnP_SCI.
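As a rough illustration of the idea of updating the plugged-in denoiser during the iterations, the following is a minimal sketch built on a standard GAP-style PnP loop for video SCI. The denoiser interface, the adaptation schedule, and the measurement-consistency loss used for the online update are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def adaptive_pnp_gap_sci(y, masks, denoiser, iters=60, adapt_every=10,
                         adapt_steps=2, lr=1e-5):
    """Sketch of PnP-GAP for video SCI with online denoiser adaptation.

    y:        (H, W) snapshot measurement with y = sum_t masks[t] * x[t]
    masks:    (T, H, W) binary modulation masks
    denoiser: torch.nn.Module mapping a (T, H, W) video cube to a denoised cube
    """
    phi_sum = (masks ** 2).sum(dim=0).clamp(min=1e-6)  # diagonal of H H^T
    v = masks * (y / phi_sum)                          # crude initialization
    optimizer = torch.optim.Adam(denoiser.parameters(), lr=lr)

    for k in range(iters):
        # GAP step: Euclidean projection onto the measurement constraint.
        residual = y - (masks * v).sum(dim=0)
        x = v + masks * (residual / phi_sum)

        # Online adaptation of the prior: a few gradient steps on a
        # measurement-consistency loss (the actual adaptation objective in the
        # paper may differ; this particular loss is an assumption).
        if k % adapt_every == 0:
            for _ in range(adapt_steps):
                optimizer.zero_grad()
                loss = ((masks * denoiser(x.detach())).sum(dim=0) - y).pow(2).mean()
                loss.backward()
                optimizer.step()

        # Plug-and-play step: apply the (adapted) denoiser as the prior.
        with torch.no_grad():
            v = denoiser(x)

    return v
```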
High-throughput imaging is highly desirable for intelligent analysis in computer vision tasks. In conventional designs, throughput is limited by the separation between physical image capture and digital post-processing. Computational imaging increases throughput by mixing analog and digital processing throughout the image capture pipeline. Yet recent advances in computational imaging have focused on compressive sampling alone, which precludes wide application in practical tasks. This paper presents a systematic analysis of the next step for computational imaging, built on snapshot compressive imaging (SCI) and semantic computer vision (SCV) tasks, which have independently emerged over the past decade as basic computational imaging platforms. SCI is a physical-layer process that maximizes information capacity per sample while minimizing system size, power, and cost. SCV is an abstraction-layer process that analyzes image data as objects and features rather than simple pixel maps. In current practice, SCI and SCV are independent and sequential. This concatenated pipeline leads to the following problems: i) substantial resources are spent on task-irrelevant computation and transmission, ii) the sampling and design efficiency of SCI is attenuated, and iii) the final performance of SCV is limited by the reconstruction errors of SCI. Bearing these concerns in mind, this paper takes a step further, aiming to bridge the gap between SCI and SCV and take full advantage of both approaches. After reviewing the current status of SCI, we propose a novel joint framework that conducts SCV on the raw measurements captured by SCI to select regions of interest, and then performs reconstruction only on these regions to reduce processing time. We use our recently built SCI prototype to verify the framework. Preliminary results are presented, and the prospects for a joint SCI and SCV regime are discussed. By conducting computer vision tasks in the compressed domain, we envision a new era of snapshot compressive imaging under limited end-to-end bandwidth.
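To make the measurement-domain pipeline concrete, here is a minimal sketch of the proposed flow under simple assumptions; detect_rois and reconstruct are placeholders for any SCV detector and SCI reconstruction routine, and the paper's prototype and algorithms are not reproduced here.

```python
import numpy as np

def joint_sci_scv(y, masks, detect_rois, reconstruct):
    """Sketch of the joint pipeline: detect regions of interest directly on the
    raw snapshot measurement, then reconstruct video only inside those regions.

    y:           (H, W) snapshot measurement
    masks:       (T, H, W) modulation masks
    detect_rois: hypothetical SCV routine returning boxes [(r0, r1, c0, c1), ...]
                 for a single 2D image
    reconstruct: any SCI reconstruction routine taking (y_crop, masks_crop)
    """
    # Normalize the measurement by the summed masks so it resembles a blurry
    # grayscale frame that a detector can operate on.
    normalized = y / np.clip(masks.sum(axis=0), 1e-6, None)
    boxes = detect_rois(normalized)

    recovered = []
    for (r0, r1, c0, c1) in boxes:
        y_crop = y[r0:r1, c0:c1]
        m_crop = masks[:, r0:r1, c0:c1]
        recovered.append(reconstruct(y_crop, m_crop))  # only the ROI is reconstructed
    return boxes, recovered
```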
Masked autoencoders (MAEs), a state-of-the-art self-supervised neural network architecture for a wide range of vision tasks, show surprisingly effective potential in reconstructing images distorted by random masking. This paper first introduces an optical implementation of MAEs, employing digital micromirror devices in the optical path to capture partially blocked images. MAEs with multi-scale patches are deployed in the reconstruction procedure. By using a reconstruction network specialized for the optical setup, the system can recover the original scenes with high quality. Simulations and experimental measurements show strong performance, achieving an average peak signal-to-noise ratio (PSNR) of 24.41 dB on the DAVIS2017 dataset and 29.92 dB (over the masked areas) on real captured images with 70% of the pixels blocked. This paves the way for low-bandwidth sampling of high-throughput, high-resolution images.
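Since the reported numbers are PSNR values restricted to the blocked regions, the sketch below shows one plausible way to generate a 70% block-wise mask and evaluate PSNR only on the masked pixels; the patch size and masking pattern are assumptions for illustration and may not match the paper's exact protocol.

```python
import numpy as np

def random_block_mask(h, w, block=16, ratio=0.7, seed=0):
    """Block-wise random mask: True = kept pixel, False = blocked.
    Assumes h and w are divisible by the block size."""
    rng = np.random.default_rng(seed)
    keep = (rng.random((h // block, w // block)) >= ratio).astype(np.uint8)
    return np.kron(keep, np.ones((block, block), dtype=np.uint8)).astype(bool)

def masked_psnr(reference, reconstruction, keep_mask, peak=1.0):
    """PSNR computed only over the blocked (keep_mask == False) pixels."""
    blocked = ~keep_mask
    mse = np.mean((reference[blocked] - reconstruction[blocked]) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```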