Figure 1: Overview of SANVis. (A) The network view displays multiple attention patterns for each layer according to three types of visualization options: (A-1) the attention piling option, (A-2) the Sankey diagram option, and (A-3) the small multiples option. (A-4) The bar chart shows the average attention weights of all heads (each colored with its corresponding hue) for each layer. (B) The HeadLens view helps the user analyze what an attention head has learned by showing representative words and by providing statistical information about part-of-speech tags and positions.
ABSTRACT
Attention networks, a deep neural network architecture inspired by the human attention mechanism, have seen significant success in image captioning, machine translation, and many other applications. Recently, they have evolved further into an advanced approach called multi-head self-attention networks, which can encode a set of input vectors, e.g., word vectors in a sentence, into another set of vectors. Such encoding aims to simultaneously capture diverse syntactic and semantic features within the set, each of which corresponds to a particular attention head, together forming multi-head attention. Meanwhile, the increased model complexity prevents users from easily understanding and manipulating the inner workings of these models. To tackle these challenges, we present a visual analytics system called SANVis, which helps users understand the behaviors and characteristics of multi-head self-attention networks. Using a state-of-the-art self-attention model called the Transformer, we demonstrate usage scenarios of SANVis in machine translation tasks. Our system is available at http://short.sanvis.org.
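As a rough illustration of the encoding described above, the following NumPy sketch computes scaled dot-product attention for several heads and concatenates their outputs into a new set of vectors. The dimensions, head count, and random projection matrices are illustrative assumptions, not the paper's actual Transformer configuration.

# Minimal multi-head self-attention sketch (illustrative, not SANVis code).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, num_heads, rng):
    """Encode a set of input vectors X (seq_len x d_model) into another set,
    with each head attending over the whole sequence independently."""
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    outputs, attentions = [], []
    for _ in range(num_heads):
        # Randomly initialized projections stand in for learned weights.
        W_q, W_k, W_v = (rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
                         for _ in range(3))
        Q, K, V = X @ W_q, X @ W_k, X @ W_v
        # Scaled dot-product attention: each row is one token's attention
        # distribution over all tokens in the sentence.
        A = softmax(Q @ K.T / np.sqrt(d_head))
        outputs.append(A @ V)
        attentions.append(A)
    # Concatenating the per-head outputs yields the multi-head encoding.
    return np.concatenate(outputs, axis=-1), np.stack(attentions)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 64))           # e.g., 5 word vectors of dimension 64
encoded, attn = multi_head_self_attention(X, num_heads=8, rng=rng)
print(encoded.shape, attn.shape)           # (5, 64) and (8, 5, 5)

The returned per-head attention matrices (attn) are the kind of weights that SANVis visualizes across layers and heads.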
Interactive visualization editors empower people to author visualizations without writing code, but they do not guide them in the art and craft of effective visual communication. In this paper, we explore the potential of using an off-the-shelf Large Language Model (LLM) to provide actionable and customized feedback to visualization designers. Our implementation, called VISUALIZATIONARY, showcases how ChatGPT can be used in this manner through two components: a preamble of visualization design guidelines and a suite of perceptual filters that extract salient metrics from a visualization image. We present findings from a longitudinal user study involving 13 visualization designers (6 novices, 4 intermediates, and 3 experts) authoring a new visualization from scratch over the course of several days. Our results indicate that providing guidance in natural language via an LLM can aid even seasoned designers in refining their visualizations. All supplemental materials accompanying this paper are available at https://osf.io/v7hu8.
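To make the two-component setup concrete, the following Python sketch composes a prompt from a guideline preamble and image-derived metrics and sends it to an LLM through the OpenAI chat API. The guideline wording, the metric names, and the extract_perceptual_metrics helper are hypothetical placeholders, not VISUALIZATIONARY's actual implementation.

# Hedged sketch: guideline preamble + perceptual metrics -> LLM feedback.
from openai import OpenAI

PREAMBLE = (
    "You are a visualization design critic. Review the described chart "
    "against established guidelines (legible labels, sufficient color "
    "contrast, uncluttered layout) and give actionable feedback."
)

def extract_perceptual_metrics(image_path: str) -> dict:
    # Placeholder for the perceptual filters; a real system might measure
    # text size, color contrast, or visual clutter from the rendered image.
    return {"min_text_px": 9, "contrast_ratio": 2.8, "edge_density": 0.41}

def request_feedback(image_path: str, model: str = "gpt-4o") -> str:
    metrics = extract_perceptual_metrics(image_path)
    user_msg = "Perceptual metrics of the current design: " + ", ".join(
        f"{name}={value}" for name, value in metrics.items()
    )
    client = OpenAI()  # expects OPENAI_API_KEY in the environment
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": PREAMBLE},
            {"role": "user", "content": user_msg},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(request_feedback("chart.png"))

The key design choice this sketch reflects is that the LLM never sees raw pixels here; it reasons over a guideline preamble plus a compact set of extracted metrics.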