We present a novel method for automatic fingerspelling recognition that can discriminate complex hand configurations with a high degree of finger occlusion. Such a scenario, while common in most fingerspelling alphabets, challenges vision methods because of the low intensity variation along important shape edges in the hand image. Our approach is based on a simple and inexpensive modification of the capture setup: a multi-flash camera is used with flashes strategically positioned to cast shadows along depth discontinuities in the scene, allowing efficient and accurate hand shape extraction. We then use a shift- and scale-invariant shape descriptor for fingerspelling recognition, demonstrating substantial improvement over methods that rely on features acquired by traditional edge detection and segmentation algorithms.
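The multi-flash idea above can be illustrated with a minimal sketch: flashes placed on opposite sides of the lens cast thin shadows on opposite sides of a depth discontinuity, and dividing each flash image by the pixelwise maximum exposes those shadows as low-ratio pixels. The synthetic 1-D scan line, shadow positions, and 0.5 threshold below are illustrative assumptions, not the authors' implementation.

```python
# Sketch of multi-flash depth-edge evidence on a synthetic 1-D scan line.
AMBIENT = 100.0   # lit-pixel intensity (assumed)
SHADOW = 20.0     # intensity inside a cast shadow (assumed)

# Two flash images of the same scan line; a raised object edge sits near
# index 10, so each flash shadows the opposite side of it.
left_flash = [AMBIENT] * 20
right_flash = [AMBIENT] * 20
for i in (11, 12):      # shadow cast by the left flash
    left_flash[i] = SHADOW
for i in (8, 9):        # shadow cast by the right flash
    right_flash[i] = SHADOW

# Pixelwise maximum over the flash images approximates a shadow-free image.
max_img = [max(l, r) for l, r in zip(left_flash, right_flash)]

def shadow_pixels(img, max_img, thresh=0.5):
    """Ratio image is ~1 in lit regions and well below 1 inside shadows."""
    return {i for i, (v, m) in enumerate(zip(img, max_img)) if v / m < thresh}

depth_edge_evidence = (shadow_pixels(left_flash, max_img)
                       | shadow_pixels(right_flash, max_img))
print(sorted(depth_edge_evidence))  # → [8, 9, 11, 12]
```

The detected pixels flank the depth discontinuity; a full system would trace such shadow onsets along the flash epipolar directions to recover the hand silhouette.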
Visual saliency detection is a useful technique for predicting which regions humans will tend to gaze upon in a given image. Over the last several decades, numerous algorithms for automatic saliency detection have been proposed and shown to work well on both synthetic and natural images. However, two key challenges remain largely unaddressed: (1) how to improve the relatively low predictive performance on images that contain large objects, and (2) how to perform saliency detection on a wider variety of image categories without training. In this paper, we propose a new saliency detection algorithm that addresses these challenges. Our model first detects potentially salient regions based on multiscale extrema of local perceived color differences measured in the CIELAB color space. These extrema are highly effective for estimating the locations, sizes, and saliency levels of candidate regions. The local saliency candidates are further refined via two global extrema-based features, and a Gaussian mixture is then used to generate the final saliency map. Experimental validation on the extensive CAT2000 data set demonstrates that our proposed method either outperforms or is highly competitive with prior approaches, performing well across different categories and object sizes while remaining training-free.
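The basic measurement underlying this pipeline, perceived color difference in CIELAB, can be sketched with the standard sRGB (D65) conversion and the Euclidean ΔE*ab distance. This is only the color-difference primitive, not the multiscale extrema detection itself, and the function names are our own.

```python
import math

def srgb_to_lab(r, g, b):
    """Convert sRGB components in [0, 1] to CIELAB (D65 white point)."""
    def inv_gamma(c):  # undo sRGB companding
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = inv_gamma(r), inv_gamma(g), inv_gamma(b)
    # linear RGB -> CIE XYZ (sRGB/D65 matrix)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white
    def f(t):  # CIELAB nonlinearity
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16.0 / 116.0
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def delta_e(lab1, lab2):
    """Perceived color difference (ΔE*ab, CIE76)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(lab1, lab2)))

white = srgb_to_lab(1.0, 1.0, 1.0)
black = srgb_to_lab(0.0, 0.0, 0.0)
print(round(delta_e(white, black), 1))  # → 100.0
```

A saliency model of this kind would apply `delta_e` between each pixel and its surroundings at several scales and look for spatial extrema of the resulting difference maps.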
In this study, the spatiotemporal characteristics of human speed perception are measured in the spatiotemporal frequency domain through a psychophysical experiment. Based on the experimental results, the mechanism of speed perception is investigated, and the existence of a speed-tuned mechanism and the role of temporal frequency in speed perception are demonstrated. The experiment uses speed comparison by simultaneous presentation: two moving gratings serve as the standard and comparison stimuli, and the "stimulation of subjectively equal speed" (SSES) is determined for the standard stimulus in spatiotemporal frequency coordinates. The results show two different spatiotemporal characteristics. For speeds below 3.2 deg/s, speed perception depends only on the physical speed of the stimulus, suggesting the existence of a speed-tuned mechanism. Above that value, speed perception depends on both the physical speed and the temporal frequency of the stimulus, indicating that a speed-tuned mechanism and a temporal frequency-tuned mechanism are both involved. Based on these results, the human speed perception mechanism is described using a speed-tuned mechanism and a temporal frequency-tuned mechanism, and a basic framework for a speed perception model is proposed.
In this paper, we propose a method for detecting vehicles in nighttime driving scenes captured by an in-vehicle monocular camera. Since vehicle shapes are difficult to recognize at night, detection is based on the headlights and taillights, which appear as bright regions of pixels called blobs. Many studies have used automatic multilevel thresholding, but such methods are easily affected by ambient light because they derive thresholds from the luminance of the whole image. For this reason, we focus on the Laplacian of Gaussian (LoG) operator, which measures the luminance difference between a blob and its surroundings. Compared with automatic multilevel thresholding, the LoG operator is more robust to ambient light; however, computing its response is expensive, so we use a method called Center Surround Extremas to detect blobs at high speed. Since the detected blobs include nuisance lights, we must determine whether each blob belongs to a vehicle, so we classify the blobs by their features using support vector machines. We then detect the traffic lane and restrict the region where vehicles may appear, and finally classify the blobs based on their movement across frames. We applied the proposed method to nighttime driving sequences and confirmed the effectiveness of the classification process and that the method can run within the frame rate.
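The LoG response this abstract relies on can be sketched directly: a bright blob on a dark road produces a strong (negative) response under the LoG kernel, while uniformly lit regions produce essentially none, which is what makes it robust to ambient light. The kernel size, σ, and synthetic image below are illustrative assumptions; the paper itself accelerates this step with Center Surround Extremas rather than dense convolution.

```python
import math

def log_kernel(size, sigma):
    """Discrete Laplacian-of-Gaussian kernel, shifted to zero mean so that
    flat (uniformly lit) regions give exactly zero response."""
    half = size // 2
    k = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            row.append(-(1.0 / (math.pi * sigma ** 4))
                       * (1 - r2 / (2 * sigma ** 2))
                       * math.exp(-r2 / (2 * sigma ** 2)))
        k.append(row)
    mean = sum(map(sum, k)) / size ** 2
    return [[v - mean for v in row] for row in k]

def response_at(img, kernel, cy, cx):
    """LoG response at one pixel (no padding; caller keeps away from borders)."""
    half = len(kernel) // 2
    return sum(img[cy + dy][cx + dx] * kernel[dy + half][dx + half]
               for dy in range(-half, half + 1)
               for dx in range(-half, half + 1))

# Synthetic night scene: dark road (10) with one 3x3 headlight blob (250).
img = [[10.0] * 21 for _ in range(21)]
for y in range(9, 12):
    for x in range(9, 12):
        img[y][x] = 250.0

k = log_kernel(9, 1.5)
blob_resp = abs(response_at(img, k, 10, 10))  # centered on the headlight
flat_resp = abs(response_at(img, k, 4, 4))    # uniform road region
```

Here `blob_resp` is large while `flat_resp` is near zero; thresholding this ratio, rather than the global image luminance, is what keeps the detector stable under varying ambient light.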