We present a new end-to-end generative adversarial network (GAN) for single-image motion deblurring, named DeblurGAN-v2, which considerably boosts state-of-the-art deblurring efficiency, quality, and flexibility. DeblurGAN-v2 is based on a relativistic conditional GAN with a double-scale discriminator. For the first time, we introduce the Feature Pyramid Network into deblurring, as a core building block in the generator of DeblurGAN-v2. The generator can flexibly work with a wide range of backbones, to navigate the balance between performance and efficiency. Plugging in sophisticated backbones (e.g., Inception-ResNet-v2) can lead to solid state-of-the-art deblurring. Meanwhile, with lightweight backbones (e.g., MobileNet and its variants), DeblurGAN-v2 runs 10-100 times faster than the nearest competitors while maintaining close to state-of-the-art results, implying the option of real-time video deblurring. We demonstrate that DeblurGAN-v2 obtains very competitive performance on several popular benchmarks, in terms of deblurring quality (both objective and subjective) as well as efficiency. In addition, we show the architecture to be effective for general image restoration tasks. Our code, models, and data are available at: https://github.com/KupynOrest/DeblurGANv2.
Almost all existing visual saliency models focus on predicting a universal saliency map across all observers. Yet psychology studies suggest that the visual attention of different observers can vary considerably under some circumstances. In this paper, we set out to study this heterogeneous visual attention pattern across observers and build the first dataset for personalized saliency detection. Further, we propose to decompose a personalized saliency map (PSM) into a universal saliency map (USM), which can be predicted by any existing saliency detection model, and a discrepancy between the two. Personalized saliency detection is then cast as the task of estimating the discrepancy between PSM and USM. To tackle this task we propose two solutions: i) The discrepancy estimation for each observer is cast as a different but related task, and we feed the image and its USM into a multi-task convolutional neural network framework to estimate the discrepancy between PSM and USM for each observer; ii) As the discrepancy is related to both the image contents and the observers' person-specific information, we feed the image and its associated USM into a convolutional neural network whose filters encode person-specific information to estimate the discrepancy. Extensive experimental results demonstrate the effectiveness of our models for PSM prediction as well as their generalization capability for unseen observers.
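The decomposition described above can be illustrated with a minimal sketch. It assumes the simplest additive reading, PSM = USM + discrepancy, applied per pixel to maps with values in [0, 1]; the abstract does not state the exact form, so this is an assumption. The function names, observer labels, and random maps are hypothetical placeholders, and the discrepancy is computed directly from known maps here rather than predicted by a network.

```python
import numpy as np


def discrepancy(psm: np.ndarray, usm: np.ndarray) -> np.ndarray:
    """Per-pixel discrepancy between a personalized and a universal map.

    Assumes an additive decomposition (PSM = USM + discrepancy); this is
    an illustrative assumption, not a form confirmed by the abstract.
    """
    return psm - usm


def reconstruct_psm(usm: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """Rebuild a personalized map from the shared USM and one observer's
    discrepancy, clipping back into the valid saliency range [0, 1]."""
    return np.clip(usm + delta, 0.0, 1.0)


# Hypothetical toy data: one shared USM and one PSM per observer.
rng = np.random.default_rng(0)
usm = rng.random((4, 4))
psm_by_observer = {
    "observer_a": rng.random((4, 4)),
    "observer_b": rng.random((4, 4)),
}

# In the paper's setting a CNN would estimate each discrepancy from the
# image and its USM; here it is taken from ground truth for illustration.
deltas = {name: discrepancy(psm, usm) for name, psm in psm_by_observer.items()}
recovered = {name: reconstruct_psm(usm, d) for name, d in deltas.items()}
```

Under this reading, the USM is computed once per image and only the discrepancy differs between observers, which is what makes it natural to treat each observer as a separate but related prediction task.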
Saliency detection is a long-standing problem in computer vision. Tremendous effort has been devoted to exploring a universal saliency model across users despite their differences in gender, race, age, etc. Yet recent psychology studies suggest that saliency is highly specific rather than universal: individuals exhibit heterogeneous gaze patterns when viewing an identical scene containing multiple salient objects. In this paper, we first show that such heterogeneity is common and critical for reliable saliency prediction. Our study also produces the first database of personalized saliency maps (PSMs). We model a PSM based on a universal saliency map (USM) shared by different participants and adopt a multi-task CNN framework to estimate the discrepancy between PSM and USM. Comprehensive experiments demonstrate that our new PSM model and prediction scheme are effective and reliable.