Real-time, two-way transmission of American Sign Language (ASL) video over cellular networks provides natural communication among members of the Deaf community. As a communication tool, compressed ASL video must be evaluated according to the intelligibility of the conversation, not according to conventional definitions of video quality. Guided by linguistic principles and human perception of ASL, this paper proposes a full-reference computational model of intelligibility for ASL (CIM-ASL) that is suitable for evaluating compressed ASL video. The CIM-ASL measures distortions only in regions relevant for ASL communication, using spatial and temporal pooling mechanisms that vary the contribution of distortions according to their relative impact on the intelligibility of the compressed video. The model is trained and evaluated using ground truth experimental data collected in three separate studies. The CIM-ASL provides accurate estimates of subjective intelligibility and demonstrates statistically significant improvements over computational models traditionally used to estimate video quality. The CIM-ASL is incorporated into an H.264-compliant video coding framework, creating a closed-loop encoding system optimized explicitly for ASL intelligibility. The ASL-optimized encoder achieves bitrate reductions between 10% and 42%, without reducing intelligibility, when compared to a general-purpose H.264 encoder.
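The idea of pooling distortions only over linguistically relevant regions can be sketched with a region-weighted error measure. The masks, weights, and frame data below are illustrative assumptions, not the actual CIM-ASL definition: the point is only that the same pixel error contributes more to the score when it falls inside an important region (e.g., face or hands) than in the background.

```python
import numpy as np

def weighted_distortion(ref, dist, importance):
    """Spatially pooled squared error, weighted by an importance map.

    `importance` is a per-pixel weight map; regions relevant to
    communication receive higher weights than the background.
    """
    err = (ref.astype(float) - dist.astype(float)) ** 2
    return float((importance * err).sum() / importance.sum())

# Toy 8x8 "frame": zero reference, with one distorted pixel placed
# either inside the important region or in the background.
ref = np.zeros((8, 8))

importance = np.full((8, 8), 0.1)   # background weight (assumed)
importance[0:4, 0:4] = 1.0          # e.g., a face/hand region (assumed)

dist_fg = ref.copy()
dist_fg[2, 2] = 10.0                # error inside the important region

dist_bg = ref.copy()
dist_bg[7, 7] = 10.0                # identical error in the background

fg_score = weighted_distortion(ref, dist_fg, importance)
bg_score = weighted_distortion(ref, dist_bg, importance)
```

Under this pooling, `fg_score` exceeds `bg_score` even though the raw pixel error is identical, which is the behavior a region-of-interest intelligibility measure is designed to capture.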
The subjective tests used to evaluate image and video quality estimators (QEs) are expensive and time consuming. More problematically, the majority of subjective testing is not designed to find systematic weaknesses in the evaluated QEs. As a result, a motivated attacker can take advantage of these systematic weaknesses to gain an unfair monetary advantage. In this paper, we draw on lessons from software testing to propose additional testing procedures that target a specific QE under test. These procedures supplement, but do not replace, the traditional subjective testing procedures that are currently used. The goal is to motivate the design of objective QEs that are better able to accurately characterize human quality assessment.
Given a network of N nodes, where the i-th sensor holds an observation x_i ∈ R^M, the matrix containing all Euclidean distances among measurements, ||x_i − x_j|| for all i, j ∈ {1, …, N}, is a useful description of the data. While reconstructing a distance matrix has a wide range of applications, we are particularly interested in manifold reconstruction and its dimensionality reduction for data fusion and query. To make this map available to all of the nodes in the network, we propose a fully decentralized consensus gossiping algorithm that is based on local neighbor communications and does not require the existence of a central entity. The main advantages of our solution are that it is insensitive to changes in the network topology and fully decentralized. We describe the proposed algorithm in detail, study its complexity in terms of the number of inter-node radio transmissions, and showcase its performance numerically.
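The consensus-averaging primitive underlying such decentralized schemes can be illustrated with randomized pairwise gossip: in each round a node exchanges values with a random neighbor and both replace their value with the pair's average, so every node converges to the network-wide mean using only local communication. The ring topology, round count, and values below are illustrative assumptions, not the paper's algorithm.

```python
import random

def gossip_average(values, neighbors, rounds=2000, seed=0):
    """Randomized pairwise gossip averaging.

    Each round, a random node i and a random neighbor j replace both of
    their values with the pair average. The sum is preserved exactly, so
    all values converge to the global mean without any central entity.
    """
    rng = random.Random(seed)
    v = list(values)
    n = len(v)
    for _ in range(rounds):
        i = rng.randrange(n)
        j = rng.choice(neighbors[i])
        avg = (v[i] + v[j]) / 2.0   # local exchange only
        v[i] = v[j] = avg
    return v

# Illustrative 6-node ring topology; the true mean of the values is 2.5.
neighbors = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
result = gossip_average(values, neighbors)
```

After enough rounds every entry of `result` is close to the mean 2.5, and because updates only touch a node and one of its neighbors, the scheme keeps working when links appear or disappear between rounds.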
Objective estimators for video are expected to accurately estimate subjective ratings provided by humans. This work presents a subjective experiment designed to acquire intelligibility ratings for a collection of compressed ASL videos. The distortions present in the experimental database are analyzed in terms of their impact on the performance of objective estimators. Distortions that do not significantly vary across space or time cannot adequately challenge traditional objective estimators, such as PSNR and RMS distortion contrast, and an objective intelligibility measure designed specifically for ASL video provides negligible improvements in prediction accuracy. Distortions that vary across space and time, affecting only localized regions in the video, are considered spatially and temporally diverse. When the distortions present in the experimental database are sufficiently diverse, the objective intelligibility measure estimates subjective ratings more accurately than PSNR and RMS distortion contrast.
Index Terms: region-of-interest coding, sign language video, video quality assessment, video quality database