We propose an automatic video inpainting algorithm which relies on the optimisation of a global, patch-based functional. Our algorithm is able to deal with a variety of challenging situations which naturally arise in video inpainting, such as the correct reconstruction of dynamic textures, multiple moving objects and moving background. Furthermore, we achieve this with an execution time an order of magnitude shorter than that of the state of the art, and we also obtain good quality results on high-definition videos. Finally, we provide specific algorithmic details to make implementation of our algorithm as easy as possible. The resulting algorithm requires no segmentation or manual input other than the definition of the inpainting mask, and can deal with a wider variety of situations than is handled by previous work.

1. Introduction

Advanced image and video editing techniques are increasingly common in the image processing and computer vision world, and are also starting to be used in media entertainment. One common and difficult task closely linked to the world of video editing is image and video "inpainting". Generally speaking, this is the task of replacing the content of an image or video with some other content which is visually pleasing. This subject has been extensively studied in the case of images, to such an extent that commercial image inpainting products destined for the general public are available, such as Photoshop's "Content-Aware Fill" [1]. However, while some impressive results have been obtained in the case of videos, the subject has been studied far less extensively than image inpainting. This relative lack of research can largely be attributed to the high time complexity introduced by the added temporal dimension. Indeed, it has only very recently become possible to produce good quality inpainting results on high-definition videos, and only in a semi-automatic manner. Nevertheless, high-quality video inpainting has many important and useful applications such as film restoration, professional post-production in cinema and video editing for personal use. For this reason, we believe that an automatic, generic video inpainting algorithm would be extremely useful for both academic and professional communities.
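As a point of reference for how such a global, patch-based functional is typically formulated, methods in this family (in the spirit of Wexler et al.) minimise a non-local patch energy of roughly the following form; the notation below is generic and not necessarily the exact functional of this work:

E(u, \phi) = \sum_{p \in \mathcal{O}} d^2\big( W_p(u),\, W_{\phi(p)}(u) \big),

where \mathcal{O} is the occluded region to be inpainted, W_p(u) is the spatio-temporal patch of the video u centred at position p, \phi maps each occluded position to a position in the unoccluded area, and d is a patch distance. The energy is minimised by alternating a nearest-neighbour search (updating \phi) with a reconstruction step (updating u), within a coarse-to-fine multi-scale pyramid.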
Even though vanishing points in digital images result from parallel lines in the 3D scene, most of the proposed detection algorithms are forced to rely heavily either on additional properties (such as orthogonality, or coplanarity and equal distance) of the underlying 3D lines, or on knowledge of the camera calibration parameters, in order to avoid spurious responses. In this work, we develop a new detection algorithm that relies on the Helmholtz principle recently proposed for computer vision by Desolneux et al. [8], [9], at both the line detection and line grouping stages. This leads to a vanishing point detector with a low false alarm rate and a high precision level, which does not rely on any a priori information about the image or calibration parameters, and does not require any parameter tuning.
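To make the grouping stage concrete, the a contrario decision rule behind the Helmholtz principle can be sketched as follows; the counting and probability model below are illustrative placeholders (a binomial background model), not the exact ones used by the detector:

import math

def binomial_tail(n, k, p):
    # P[X >= k] for X ~ Binomial(n, p): the probability that at least k of n
    # independent line segments point towards a candidate vanishing region
    # of prior probability p purely by chance.
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def nfa(n_tests, n, k, p):
    # Number of False Alarms: the expected number of equally strong events
    # under the pure-chance (a contrario) model, over all n_tests candidates.
    return n_tests * binomial_tail(n, k, p)

def is_meaningful(n_tests, n, k, p, epsilon=1.0):
    # A candidate is kept only when its NFA falls below a fixed threshold
    # (conventionally 1), which bounds the expected number of false detections.
    return nfa(n_tests, n, k, p) < epsilon

Because the threshold bounds the expected number of false detections per image, nothing needs to be tuned to the content of a particular image, which is the property referred to above.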
We propose in this paper a total variation based restoration model which incorporates the image acquisition model z = h * u + n (where h denotes the blurring kernel and n a white Gaussian noise) as a set of local constraints. These constraints, one for each pixel of the image, express the fact that the variance of the noise can be estimated from the residuals h * u − z over a neighborhood of each pixel. This is motivated by the fact that the usual inclusion of the image acquisition model as a single constraint, expressing a bound on the variance of the noise over the whole image, does not give satisfactory results if we wish to simultaneously recover textured regions and obtain a good denoising of the image. We use Uzawa's algorithm to minimize the total variation subject to the proposed family of local constraints, and we present some experiments using this model.
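In symbols, the model described above can be sketched as the following constrained problem; the neighbourhood B(x) and the noise level \sigma are generic placeholders for the quantities defined in the paper:

\min_u \int_\Omega |\nabla u| \, dx
\quad \text{subject to} \quad
\frac{1}{|B(x)|} \int_{B(x)} \big( (h * u)(y) - z(y) \big)^2 \, dy \;\le\; \sigma^2
\quad \text{for every pixel } x \in \Omega.

Uzawa's algorithm handles this by attaching one Lagrange multiplier \lambda(x) \ge 0 to each local constraint, alternating a minimisation of the Lagrangian over u with a projected ascent step \lambda(x) \leftarrow \max\big(0, \lambda(x) + \rho\, r(x)\big), where r(x) is the residual of the local constraint at x and \rho > 0 is a step size.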
Image inpainting is the process of filling in missing regions in an image in a plausible way. In this contribution, we propose and describe an implementation of a patch-based image inpainting algorithm. The method is a two-dimensional version of our video inpainting algorithm; its functional specifies that a good solution to the inpainting problem should be an image in which each patch is very similar to its nearest neighbor in the unoccluded area. Iterations are performed in a multi-scale framework which yields globally coherent results. In this manner, two of the major goals of image inpainting, the correct reconstruction of textures and structures, are addressed. We address a series of important practical issues which arise when using such an approach. In particular, we reduce execution times by using the PatchMatch algorithm [C. Barnes et al., PatchMatch: A randomized correspondence algorithm for structural image editing, ACM Transactions on Graphics, 2009] for nearest neighbor searches, and we propose a modified patch distance which improves the comparison of textured patches. We also address the crucial issue of initialization and the choice of the number of pyramid levels, two points which are rarely discussed in such approaches. We provide several examples which illustrate the advantages of our algorithm, and compare our results with those of state-of-the-art methods.

Source Code: The reviewed source code and documentation for this algorithm are available from the web page of this article.
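For illustration, a single reconstruction step of such a patch-based scheme can be written in a few lines; the sketch below assumes a grayscale image, an occlusion mask and a nearest-neighbour field (for instance produced by PatchMatch), and is a simplified illustration rather than the reviewed implementation:

import numpy as np

def reconstruct_occlusion(image, mask, nnf, patch_size=7):
    # image: (H, W) float array; mask: (H, W) bool, True inside the occlusion;
    # nnf: (H, W, 2) int array mapping each patch centre to the centre of its
    # current nearest-neighbour patch in the unoccluded area (assumed to lie
    # at least patch_size // 2 pixels away from the image border).
    half = patch_size // 2
    acc = np.zeros_like(image, dtype=np.float64)
    cnt = np.zeros_like(image, dtype=np.float64)
    H, W = image.shape
    # For clarity every patch centre is visited; a real implementation only
    # visits patches that overlap the occlusion.
    for y in range(half, H - half):
        for x in range(half, W - half):
            ny, nx = nnf[y, x]
            src = image[ny - half:ny + half + 1, nx - half:nx + half + 1]
            acc[y - half:y + half + 1, x - half:x + half + 1] += src
            cnt[y - half:y + half + 1, x - half:x + half + 1] += 1.0
    out = image.astype(np.float64).copy()
    fill = mask & (cnt > 0)
    out[fill] = acc[fill] / cnt[fill]  # average of all overlapping proposals
    return out

In the full algorithm this reconstruction step alternates with the nearest-neighbour search and is repeated at every level of the multi-scale pyramid, from coarse to fine.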
Depth estimation is of critical interest for scene understanding and accurate 3D reconstruction. Most recent approaches to depth estimation with deep learning exploit geometrical structures of standard sharp images to predict the corresponding depth maps. However, cameras can also produce images with defocus blur, depending on the depth of the objects and the camera settings, and these features may represent an important hint for learning to predict depth. In this paper, we propose a full system for single-image depth prediction in the wild using depth-from-defocus and neural networks. We carry out thorough experiments to test deep convolutional networks on real and simulated defocused images using a realistic model of blur variation with respect to depth. We also investigate the influence of blur on depth prediction by observing model uncertainty with a Bayesian neural network approach. From these studies, we show that out-of-focus blur greatly improves the performance of the depth-prediction network. Furthermore, we transfer the ability learned on a synthetic, indoor dataset to real, indoor and outdoor images. For this purpose, we present a new dataset containing real all-focus and defocused images from a Digital Single-Lens Reflex (DSLR) camera, paired with ground truth depth maps obtained with an active 3D sensor for indoor scenes. The proposed approach is successfully validated on both this new dataset and standard ones such as NYUv2 or Depth-in-the-Wild. Code and the new datasets are available at https://github.com/marcelampc/d3net_depth_estimation.
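As background on the blur model, the dependence of defocus blur on depth is commonly described by the thin-lens circle-of-confusion formula; the sketch below states that standard formula and is not necessarily the exact simulation pipeline used in the paper:

def circle_of_confusion(depth, focus_dist, focal_len, f_number):
    # Diameter of the blur disc for a point at `depth` when the camera is
    # focused at `focus_dist` (all lengths in the same unit, e.g. metres).
    # Dividing by the sensor pixel pitch gives the blur size in pixels.
    aperture = focal_len / f_number  # aperture diameter
    return aperture * abs(depth - focus_dist) / depth * focal_len / (focus_dist - focal_len)

# Example: with a 50 mm lens at f/2.8 focused at 2 m, an object at 4 m
# produces a blur disc of circle_of_confusion(4.0, 2.0, 0.05, 2.8) metres.

The formula makes explicit why defocus carries depth information: for fixed camera settings, the blur size grows monotonically with the distance to the focal plane on either side of it.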