We report on the design and testing of an image processing algorithm for localizing the optic disk (OD) in low-resolution (about 20 µm/pixel) color fundus images. The design combines two procedures: 1) a Hausdorff-based template matching technique on edge maps, guided by 2) a pyramidal decomposition for large-scale object tracking. The two approaches are tested against a database of 40 images of varying visual quality and retinal pigmentation, acquired through both normal and small pupils. An average error of 7% on OD center positioning is reached with no false detections. In addition, a confidence level is associated with the final detection, indicating the "level of difficulty" the detector had in identifying the OD position and shape.
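The Hausdorff-based matching step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the OD template and the fundus image have already been reduced to sets of edge-point coordinates (e.g. by a Canny detector), and it scans a list of candidate centers rather than a full pyramidal search. All function names here are hypothetical.

```python
import numpy as np

def directed_hausdorff(a, b):
    """Directed Hausdorff distance h(A, B) = max over a in A of min over b in B
    of ||a - b||, for (m, 2) and (n, 2) arrays of edge-point coordinates."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # (m, n) pairwise distances
    return d.min(axis=1).max()

def match_template(edge_points, template_points, candidates):
    """Score each candidate OD center by the directed Hausdorff distance between
    the shifted template edge points and the image edge points; return the
    candidate with the lowest distance and its score."""
    best_pos, best_score = None, np.inf
    for (cx, cy) in candidates:
        shifted = template_points + np.array([cx, cy])
        score = directed_hausdorff(shifted, edge_points)
        if score < best_score:
            best_pos, best_score = (cx, cy), score
    return best_pos, best_score
```

In the paper's setup, the pyramidal decomposition would supply the coarse candidate locations, and the Hausdorff score at the best match could serve as the basis for the reported confidence level.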
This paper reports on the implementation of a GPU-based, real-time eye blink detector for very low contrast images acquired under near-infrared illumination. This detector is part of a multi-sensor data acquisition and analysis system for driver performance assessment and training. Eye blinks are detected inside regions of interest that are aligned with the subject's eyes at initialization. Alignment is maintained through time by tracking SIFT feature points, which are used to estimate the affine transformation between the initial face pose and the pose in subsequent frames. The GPU implementation of the SIFT feature point extraction algorithm ensures real-time processing. An eye blink detection rate of 97% is obtained on a video dataset of 33,000 frames showing 237 blinks from 22 subjects.
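The pose-tracking step above hinges on estimating an affine transform from matched feature points. A minimal least-squares sketch of that estimation is given below; it assumes the SIFT extraction and matching have already produced corresponding point pairs (the paper's GPU SIFT stage), and the function names are illustrative, not the authors' API.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares estimate of the 2x3 affine transform A mapping src -> dst.
    src, dst: (n, 2) arrays of matched feature-point coordinates, n >= 3.
    Solves the linear system row-by-row for p = [a11 a12 tx a21 a22 ty]."""
    n = src.shape[0]
    M = np.zeros((2 * n, 6))
    M[0::2, 0:2] = src   # x' = a11*x + a12*y + tx
    M[0::2, 2] = 1.0
    M[1::2, 3:5] = src   # y' = a21*x + a22*y + ty
    M[1::2, 5] = 1.0
    b = dst.reshape(-1)
    p, *_ = np.linalg.lstsq(M, b, rcond=None)
    return p.reshape(2, 3)

def warp_points(points, A):
    """Map points (e.g. eye ROI corners) through the affine transform A."""
    return points @ A[:, :2].T + A[:, 2]
```

Applying `warp_points` to the initial eye ROI corners each frame is how the regions of interest would stay aligned with the eyes as the head moves; a robust variant (e.g. RANSAC over the matches) would be needed in practice to reject SIFT mismatches.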
The paper reports on the development of a software module that performs autonomous object detection, recognition, and tracking in outdoor urban environments. The purpose of the project was to endow a commercial PTZ camera with object tracking and recognition capabilities to automate some surveillance tasks. The module can discriminate between various moving objects, identify the presence of pedestrians or vehicles, track them, and zoom in on them, in near real-time. The paper gives an overview of the module's characteristics and its operational uses within the commercial system.
This paper presents the status of an R&D project targeting the development of computer-vision tools to assist humans in generating and rendering video description for people with vision loss. Three principal issues are discussed: (1) production practices, (2) needs of people with vision loss, and (3) current system design, core technologies, and implementation. The paper provides the main conclusions of consultations with producers of video description regarding their practices and with end-users regarding their needs, as well as an analysis of described productions that led to the proposal of a video description typology. The current status of a software prototype (the audio-vision manager) is also presented; it uses many computer-vision technologies (shot transition detection, key-frame identification, key-face recognition, key-text spotting, visual motion, gait/gesture characterization, key-place identification, key-object spotting, and image categorization) to automatically extract visual content, associate textual descriptions, and add them to the audio track with a synthetic voice. A proof of concept is also briefly described for a first adaptive video description player, which allows end users to select various levels of video description.