Sign Language Recognition (SLR) aims to interpret sign language into text or speech, so as to facilitate communication between deaf-mute people and hearing people. This task has broad social impact, but is still very challenging due to the complexity and large variations in hand actions. Existing methods for SLR use hand-crafted features to describe sign language motion and build classification models based on those features. However, it is difficult to design reliable features that adapt to the large variations of hand gestures. To address this problem, we propose a novel 3D convolutional neural network (CNN) which automatically extracts discriminative spatio-temporal features from the raw video stream without any prior knowledge, avoiding the need to design features by hand. To boost the performance, multiple channels of video streams, including color information, depth cues, and body joint positions, are used as input to the 3D CNN in order to integrate color, depth and trajectory information. We validate the proposed model on a real dataset collected with Microsoft Kinect and demonstrate its effectiveness over traditional approaches based on hand-crafted features.
Hand posture recognition (HPR) is quite a challenging task, due to both the difficulty in detecting and tracking hands with normal cameras and the limitations of traditional manually selected features. In this article, we propose a two-stage HPR system for Sign Language Recognition using a Kinect sensor. In the first stage, we propose an effective algorithm to implement hand detection and tracking. The algorithm incorporates both color and depth information, without specific requirements on uniform-colored or stable background. It can handle situations in which hands are very close to other parts of the body or are not the nearest objects to the camera, and it allows for occlusion of hands caused by faces or other hands. In the second stage, we apply deep neural networks (DNNs) to automatically learn features from hand posture images that are insensitive to movement, scaling, and rotation. Experiments verify that the proposed system works quickly and accurately and achieves a recognition accuracy as high as 98.12%. (2015. A real-time hand posture recognition system using deep neural networks.)
This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants solved the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a DSLR camera. The target metric used in this challenge combined the runtime, PSNR scores and the perceptual quality of the solutions as measured in a user study. To ensure the efficiency of the submitted models, we additionally measured their runtime and memory requirements on Android smartphones. The proposed solutions significantly improved on the baseline results, defining the state of the art for image enhancement on smartphones. * A. Ignatov and R. Timofte ({andrey,radu.timofte}@vision.ee.ethz.ch, ETH Zurich) are the challenge organizers, while the other authors participated in the challenge. The Appendix contains the authors' teams and affiliations. PIRM 2018 Challenge webpage: http://ai-benchmark.org
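As a point of reference for the PSNR component of the target metric mentioned above, the standard definition can be sketched as follows. This is only an illustrative sketch: the function name `psnr` and the `max_value` default of 255 (8-bit images) are assumptions, and the abstract does not say how PSNR was combined with runtime and the user-study scores, so that combination is not shown.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (assumed 255 for 8-bit data).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Higher values indicate a closer match to the reference; PSNR is a pixel-wise fidelity measure and does not by itself capture perceptual quality, which is why the challenge also relied on a user study.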
Remote sensing images are often polluted by stripe noise, which degrades their visual quality. Thus, it is necessary to remove stripe noise before subsequent applications, e.g., classification and target recognition. This paper aims to remove the stripe noise to enhance the visual quality of images, while preserving image details in stripe-free regions. Instead of solving for the underlying image directly, as a variety of algorithms do, we first estimate the stripe noise from the degraded image, then compute the final destriped image as the difference between the known striped image and the estimated stripe noise. In this paper, we propose a non-convex ℓ0 sparse model for remote sensing image destriping by taking full consideration of the intrinsically directional and structural priors of stripe noise, as well as the locally continuous property of the underlying image. Moreover, the proposed non-convex model is solved by a proximal alternating direction method of multipliers (PADMM) based algorithm. In addition, we give the corresponding theoretical analysis of the proposed algorithm. Extensive experimental results on simulated and real data demonstrate that the proposed method outperforms recent competitive destriping methods, both visually and quantitatively. Keywords: non-convex ℓ0 sparse model; PADMM based algorithm; mathematical program with equilibrium constraints (MPEC); stripe noise removal
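The two-step structure described above (estimate the stripe component, then subtract it from the observed image) can be sketched as follows. This is a minimal sketch under strong assumptions and is not the paper's non-convex sparse model or its PADMM solver: the estimator `estimate_column_stripes` is a hypothetical stand-in that assumes vertical stripes which are constant along each column and have zero mean across columns.

```python
import numpy as np


def estimate_column_stripes(striped):
    """Hypothetical stand-in estimator (NOT the paper's method).

    Assumes each column carries a constant offset, and takes that offset to be
    the column mean minus the mean of all column means.
    """
    striped = np.asarray(striped, dtype=np.float64)
    col_means = striped.mean(axis=0)
    offsets = col_means - col_means.mean()
    # Repeat the per-column offset down every row to form a stripe image.
    return np.tile(offsets, (striped.shape[0], 1))


def destripe(striped, stripe_estimate):
    """Destriped image = observed striped image minus the estimated stripes."""
    return np.asarray(striped, dtype=np.float64) - stripe_estimate


# Usage on synthetic data: a flat image plus zero-mean column offsets.
clean = np.full((5, 4), 100.0)
stripes = np.tile(np.array([3.0, -3.0, 1.0, -1.0]), (5, 1))
restored = destripe(clean + stripes, estimate_column_stripes(clean + stripes))
```

On real scenes the column means also contain image content, so this naive estimator would remove genuine structure; the directional and structural priors described in the abstract are what address that.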