2019
DOI: 10.1109/tvcg.2018.2865230

NLIZE: A Perturbation-Driven Visual Interrogation Tool for Analyzing and Interpreting Natural Language Inference Models

Abstract: With the recent advances in deep learning, neural network models have obtained state-of-the-art performances for many linguistic tasks in natural language processing. However, this rapid progress also brings enormous challenges. The opaque nature of a neural network model leads to hard-to-debug systems and difficult-to-interpret mechanisms. Here, we introduce a visualization system that, through a tight yet flexible integration between visualization elements and the underlying model, allows a user to interroga…

Cited by 42 publications (32 citation statements)
References 19 publications
“…The data type(s) used in each system is another relevant dimension for comparing human‐centered evaluations. The predominant data types are multivariate data (e.g., [PNKC20; WMJ∗19; XMT∗20]); text (e.g., [ARO∗17; ESD∗19; LLL∗19]); and images (e.g., [CRH∗19; LSL∗17; SSSE20]). Only very few papers use other data types like videos (e.g., [KAY∗19]) or geo data (e.g., [PZDD19]).…”
Section: Dimensions of Analysis
confidence: 99%
“…Explore – Exploration is the task most frequently referenced in evaluation descriptions. Many papers primarily focused on case studies (e.g., [XXM∗19]), use cases (e.g., [CMQ20; LLL∗19]), and questionnaires (e.g., [WSW∗18]). Stahnke et al. introduced probing as “a general interaction approach for information visualization that is aimed at both exploring the data as well as examining its representation” [SDMT16].…”
Section: Evaluating the Technique Contributions of HCML
confidence: 99%
“…Ultimately, from further analysis of the statistics, we conclude that in five cases both domain and ML experts used visualization tools and evaluated them. In 32 cases, e.g., [CS14, KCK*19, LLL*19, MvW11, XYC*18], only domain experts were asked, and in 19 cases only ML experts participated, such as in [LSC*18, NHP*18].…”
Section: In-depth Categorization of Trust Against Facets of Intera…
confidence: 99%
“…The multi-level matching attention of our work is closely related to the issue of sequence matching. The task of sequence matching aims to compare two sequences and identify the relationship between them, such as paraphrase identification [32], natural language inference [33] and answer sentence selection [34]. In these domains, learning the representation of each sequence independently, without fine-grained interaction, has been shown to be outperformed by attention mechanisms with sophisticated interaction.…”
Section: Sequence Matching and Interaction
confidence: 99%
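
To make the interaction idea in the last statement concrete, the following is a minimal sketch of attention-based sequence matching, in which each token of one sequence attends over the other so the pair is compared jointly rather than encoded independently. This is an illustrative assumption, not the implementation of the citing paper or of NLIZE; all function names, dimensions, and the matching features are hypothetical.

```python
# Minimal sketch (illustrative, not the cited papers' code) of cross-attention
# sequence matching: tokens of one sequence attend over the other, producing
# aligned representations that downstream layers can compare to decide
# relations such as entailment or paraphrase.

import torch
import torch.nn.functional as F


def cross_attention_match(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """For each token in `a` (len_a, dim), return an attention-weighted
    summary of `b` (len_b, dim)."""
    scores = a @ b.T                      # (len_a, len_b) similarity matrix
    weights = F.softmax(scores, dim=-1)   # attention over tokens of b
    return weights @ b                    # (len_a, dim) aligned representation


if __name__ == "__main__":
    torch.manual_seed(0)
    premise = torch.randn(7, 64)      # hypothetical encoded premise tokens
    hypothesis = torch.randn(5, 64)   # hypothetical encoded hypothesis tokens
    aligned = cross_attention_match(hypothesis, premise)
    # One common matching feature: concatenate each token, its aligned
    # counterpart, and their difference, then pool over the sequence.
    features = torch.cat([hypothesis, aligned, hypothesis - aligned], dim=-1)
    print(features.mean(dim=0).shape)  # torch.Size([192])
```

The point of the sketch is the interaction step: the similarity matrix couples the two sequences, which is what independent encoders lack.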