We introduce an algorithm for automatic selection of semantically-resonant colors to represent data (e.g., using blue for data about "oceans", or pink for "love"). Given a set of categorical values and a target color palette, our algorithm matches each data value with a unique color. Values are mapped to colors by collecting representative images, analyzing image color distributions to determine value-color affinity scores, and choosing an optimal assignment. Our affinity score balances the probability of a color with how well it discriminates among data values. A controlled study shows that expert-chosen semantically-resonant colors improve speed on chart reading tasks compared to a standard palette, and that our algorithm selects colors that lead to similar gains. A second study verifies that our algorithm effectively selects colors across a variety of data categories.
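To make the assignment step concrete, the sketch below scores value-color affinity from per-value color histograms and then picks a unique color per value with the Hungarian algorithm. It is a minimal sketch under simplifying assumptions: the histogram format, the affinity formula (a color's frequency for a value minus its strongest frequency among competing values, as a rough discriminability term), and the function names are illustrative, not the paper's exact formulation.

```python
# Illustrative sketch of affinity-based color assignment (assumed details,
# not the paper's exact scoring). Requires numpy and scipy.
import numpy as np
from scipy.optimize import linear_sum_assignment

def affinity_matrix(color_hists):
    """color_hists: (n_values, n_palette_colors) array, where row v holds the
    relative frequency of each palette color in images collected for value v."""
    hists = np.asarray(color_hists, dtype=float)
    n_values, _ = hists.shape
    scores = np.empty_like(hists)
    for v in range(n_values):
        others = np.delete(hists, v, axis=0)
        # Balance a color's probability for this value against how strongly the
        # same color is claimed by competing values (a simple discriminability proxy).
        scores[v] = hists[v] - others.max(axis=0)
    return scores

def assign_colors(color_hists):
    """Return a dict mapping each value index to a unique palette color index."""
    scores = affinity_matrix(color_hists)
    # linear_sum_assignment minimizes cost, so negate the affinities.
    rows, cols = linear_sum_assignment(-scores)
    return dict(zip(rows.tolist(), cols.tolist()))

# Example: 3 categories ("ocean", "forest", "love") vs. 3 palette colors (blue, green, pink).
hists = [[0.7, 0.2, 0.1],
         [0.2, 0.6, 0.2],
         [0.1, 0.1, 0.8]]
print(assign_colors(hists))  # {0: 0, 1: 1, 2: 2}
```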
In this paper, a new automatic contour tracking system, EdgeTrak, for ultrasound image sequences of the human tongue is presented. The images are produced by a head and transducer support system (HATS). Noise and unrelated high-contrast edges in ultrasound images make it very difficult to detect the correct tongue surface automatically. In our tracking system, a novel active contour model is developed. Unlike classical active contour models, which use only the image gradient as the image force, the proposed model incorporates edge gradient and intensity information in local regions around each snake element. In contrast to other active contour models that use homogeneity of intensity in a region as a constraint and are therefore applicable only to closed contours, the proposed model applies local region information to open contours and can be used to track partial tongue surfaces in ultrasound images. The contour orientation is also taken into account so that unrelated edges in ultrasound images are discarded. Dynamic programming is used as the optimisation method in our implementation. The proposed active contour model has been applied to human tongue tracking, and its robustness and accuracy have been verified by quantitative comparison with tracking performed by speech scientists.
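As an illustration of the dynamic-programming optimisation, the sketch below searches candidate positions for each snake point of an open contour and minimizes an external energy plus a quadratic smoothness penalty. It is a simplified sketch: the energy definitions, candidate generation, and penalty weight are assumptions, and it omits EdgeTrak's local-region intensity and orientation terms.

```python
# Minimal dynamic-programming snake optimization over an open contour.
# The external energy and the quadratic smoothness penalty are simplifying
# assumptions; EdgeTrak's actual energy also uses local edge gradient,
# intensity information, and contour orientation.
import math

def optimize_contour(candidates, external_energy, smoothness=1.0):
    """candidates: list of lists of (x, y) candidate positions, one list per snake point.
    external_energy: function (x, y) -> float (lower is better, e.g. -|gradient|).
    Returns the candidate position chosen for each snake point."""
    n = len(candidates)
    # cost[i][j]: best total energy of a contour ending at candidate j of point i.
    cost = [[external_energy(x, y) for (x, y) in candidates[0]]]
    back = []
    for i in range(1, n):
        row_cost, row_back = [], []
        for (x, y) in candidates[i]:
            best_c, best_j = math.inf, -1
            for j, (px, py) in enumerate(candidates[i - 1]):
                # Smoothness term: penalize large jumps between neighboring points.
                c = cost[i - 1][j] + smoothness * ((x - px) ** 2 + (y - py) ** 2)
                if c < best_c:
                    best_c, best_j = c, j
            row_cost.append(best_c + external_energy(x, y))
            row_back.append(best_j)
        cost.append(row_cost)
        back.append(row_back)
    # Backtrack from the best final candidate.
    j = min(range(len(cost[-1])), key=lambda k: cost[-1][k])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i - 1][j]
        path.append(j)
    path.reverse()
    return [candidates[i][path[i]] for i in range(n)]
```

In a setup like this, the orientation constraint mentioned in the abstract could be folded into candidate generation, for example by discarding candidate edge points whose gradient direction is inconsistent with the expected tongue surface orientation.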
Our ability to reliably name colors provides a link between visual perception and symbolic cognition. In this paper, we investigate how a statistical model of color naming can enable user interfaces to meaningfully mimic this link and support novel interactions. We present a method for constructing a probabilistic model of color naming from a large, unconstrained set of human color name judgments. We describe how the model can be used to map between colors and names and define metrics for color saliency (how reliably a color is named) and color name distance (the similarity between colors based on naming patterns). We then present a series of applications that demonstrate how color naming models can enhance graphical interfaces: a color dictionary & thesaurus, name-based pixel selection methods for image editing, and evaluation aids for color palette design.
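The two metrics can be illustrated with a small sketch over a table of color-name judgment counts: saliency as the negative entropy of a color's name distribution, and name distance as a distance between naming distributions. The table format and the particular formulas (negative entropy, Hellinger distance) are plausible instantiations chosen for illustration, not necessarily the paper's exact definitions.

```python
# Sketch of saliency and name-distance metrics from a color-naming table.
# The counts, the entropy-based saliency, and the Hellinger distance are
# assumptions chosen to match the metrics described in the abstract.
import math

def name_distribution(counts):
    """counts: dict name -> number of times the color received that name."""
    total = sum(counts.values())
    return {name: c / total for name, c in counts.items()}

def saliency(counts):
    """Negative entropy: closer to 0 when one name dominates (reliable naming),
    more negative when naming judgments are spread across many names."""
    p = name_distribution(counts)
    return sum(pi * math.log(pi) for pi in p.values() if pi > 0)

def name_distance(counts_a, counts_b):
    """Distance in [0, 1]: 0 when two colors are named identically."""
    pa, pb = name_distribution(counts_a), name_distribution(counts_b)
    names = set(pa) | set(pb)
    bc = sum(math.sqrt(pa.get(n, 0.0) * pb.get(n, 0.0)) for n in names)  # Bhattacharyya coefficient
    return math.sqrt(max(0.0, 1.0 - bc))  # Hellinger distance

# Example with made-up judgment counts.
navy = {"blue": 90, "navy": 40, "purple": 5}
teal = {"teal": 50, "green": 30, "blue": 30, "cyan": 20}
print(saliency(navy), saliency(teal))  # navy scores higher (named more consistently)
print(name_distance(navy, teal))       # moderate distance between naming patterns
```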
This article presents a segmental vocoder driven by ultrasound and optical images (standard CCD camera) of the tongue and lips for a "silent speech interface" application, usable either by a laryngectomized patient or for silent communication. The system is built around an audiovisual dictionary that associates visual with acoustic observations for each phonetic class. Visual features are extracted from ultrasound images of the tongue and from video images of the lips using a PCA-based image coding technique. Visual observations of each phonetic class are modeled by continuous HMMs. The system then combines a phone recognition stage with corpus-based synthesis. In the recognition stage, the visual HMMs are used to identify phonetic targets in a sequence of visual features. In the synthesis stage, these phonetic targets constrain the dictionary search for the sequence of diphones that maximizes similarity to the input test data in the visual space, subject to a concatenation cost in the acoustic domain. A prosody template is extracted from the training corpus, and the final speech waveform is generated using "Harmonic plus Noise Model" concatenative synthesis techniques. Experimental results are based on an audiovisual database containing one hour of continuous speech from each of two speakers.
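To illustrate the PCA-based image coding step, the sketch below projects vectorized image frames onto the leading principal components of the training frames. The array shapes, component count, and function names are assumptions for illustration; the abstract does not specify the actual feature-extraction details.

```python
# Sketch of PCA-based image coding for visual features: project each
# ultrasound (or lip) frame onto the leading principal components of the
# training frames. Shapes, component count, and names are illustrative.
import numpy as np

def fit_pca(frames, n_components=30):
    """frames: (n_frames, height*width) array of vectorized training images."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # SVD of the centered data; rows of vt are the principal component images.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def encode(frame, mean, components):
    """Return the low-dimensional visual feature vector for one frame."""
    return components @ (frame - mean)

# Example with random data standing in for 64x64 ultrasound frames.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 64 * 64))
mean, comps = fit_pca(train, n_components=30)
features = encode(rng.normal(size=64 * 64), mean, comps)
print(features.shape)  # (30,)
```

Feature vectors of this kind would then be the observations modeled by the continuous HMMs in the recognition stage.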