2018
DOI: 10.48550/arxiv.1806.07538
Preprint

Towards Robust Interpretability with Self-Explaining Neural Networks

Cited by 26 publications (43 citation statements)
References: 0 publications
“…Though [7] discusses several advantages of gradient-based methods over rationalization, they are post-hoc and cannot impose structural constraints on the explanation. Other lines of work that provide post-hoc explanations include local perturbations [25,30]; locally fitting interpretable models [1,36]; and generating explanations in the form of edits to inputs that change model prediction to the contrast case [37].…”
Section: Model Interpretability Beyond Selective Rationalization
confidence: 99%
“…where (ii) results from the definition of $\theta_r^*$ in Equation (6); (i) is due to the linearity of $\mathcal{L}_r$ with respect to $\beta$. More specifically, $\mathcal{L}_r\big(\beta\pi^{(1)} + (1-\beta)\pi^{(2)},\, \theta_r^*(\beta\pi^{(1)}$…”
Section: A.1 Proof to Theorem
confidence: 99%
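The linearity step invoked in this excerpt can be made explicit. A hedged reconstruction follows; the symbols $\mathcal{L}_r$, $\pi^{(1)}$, $\pi^{(2)}$, and $\beta$ are taken from the quoted appendix and their precise definitions are assumed, not given here:

```latex
% If L_r(pi, theta) is linear in its first argument pi, then for any
% convex combination with coefficient beta in [0, 1]:
\mathcal{L}_r\big(\beta\pi^{(1)} + (1-\beta)\pi^{(2)},\, \theta\big)
  = \beta\,\mathcal{L}_r\big(\pi^{(1)}, \theta\big)
  + (1-\beta)\,\mathcal{L}_r\big(\pi^{(2)}, \theta\big).
```

This identity is what lets the cited proof evaluate the loss at a mixture of the two distributions term by term.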