Recognizing freehand sketches, which are highly arbitrary, is so challenging that automatic recognition rates have plateaued in recent years. In this paper, we explicitly explore the shape properties of sketches, which have been almost entirely neglected in the context of deep learning, and propose a sequential dual learning strategy that combines shape and texture features. We devise a two-stage recurrent neural network to balance these two types of features. Our architecture also considers the stroke order of sketches to reduce the intra-class variation of input features. Extensive experiments on the TU-Berlin benchmark show that our method achieves a recognition rate above 90% for the first time on this task, outperforming both humans and state-of-the-art algorithms by over 19 and 7.5 percentage points, respectively. In particular, our approach distinguishes sketches with similar textures but different shapes more effectively than recent deep networks. Based on the proposed method, we develop an online sketch retrieval and imitation application to teach children or adults to draw. The application is available as Sketch.Draw.
CCS CONCEPTS • Computing methodologies → Object recognition.
Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) aims at finding a specific image in a large gallery given a query sketch. Despite the wide applicability of FG-SBIR in many critical domains (e.g., criminal activity tracking), existing approaches still suffer from low accuracy and are sensitive to external noise such as unnecessary strokes in the sketch. Retrieval performance deteriorates further under a more practical on-the-fly setting, where only a partially complete sketch with a few (noisy) strokes is available to retrieve the corresponding images. We propose a novel framework built on a deep reinforcement learning model that performs a dual-level exploration to handle partial sketch training and attention region selection. By focusing the model's attention on the important regions of the original sketches, the framework remains robust to unnecessary stroke noise and improves retrieval accuracy by a large margin. To sufficiently explore partial sketches and locate the important regions to attend to, the model performs bootstrapped policy gradient for global exploration while adjusting a standard deviation term that governs a locator network for local exploration. The training process is guided by a hybrid loss that integrates a reinforcement loss and a supervised loss. A dynamic ranking reward is developed to fit the on-the-fly image retrieval process using partial sketches. Extensive experiments on three public datasets show that the proposed approach achieves state-of-the-art performance on partial-sketch-based image retrieval.
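The abstract above names three ingredients without giving formulas: a locator governed by a standard deviation term, a ranking-based reward, and a hybrid loss that adds a reinforcement term to a supervised term. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the isotropic Gaussian locator, the reciprocal-rank reward (standing in for the paper's dynamic ranking reward), and the plain REINFORCE term with a baseline (standing in for the bootstrapped policy gradient).

```python
import math
import random


def sample_location(mean, std, rng):
    """Sample an attention location from an isotropic Gaussian.

    Hypothetical locator: `mean` is the predicted region centre and `std`
    controls local exploration (larger std = wider search).
    Returns the sampled location and its log-probability.
    """
    loc = [rng.gauss(m, std) for m in mean]
    log_prob = sum(
        -0.5 * ((x - m) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)
        for x, m in zip(loc, mean)
    )
    return loc, log_prob


def rank_reward(rank):
    """Hypothetical reward: reciprocal rank of the true image (1 = best)."""
    return 1.0 / rank


def hybrid_loss(log_prob, reward, baseline, supervised_loss, weight=1.0):
    """Assumed hybrid loss: REINFORCE term plus weighted supervised term."""
    reinforce = -(reward - baseline) * log_prob
    return reinforce + weight * supervised_loss


if __name__ == "__main__":
    rng = random.Random(0)
    loc, log_prob = sample_location([0.5, 0.5], std=0.1, rng=rng)
    reward = rank_reward(rank=4)  # true image retrieved at rank 4
    loss = hybrid_loss(log_prob, reward, baseline=0.1, supervised_loss=0.3)
    print(loc, log_prob, reward, loss)
```

In this sketch, shrinking `std` over training would narrow the local search around the predicted region, while the reward rises as the true image moves up the ranking for a partial sketch.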