2020
DOI: 10.1111/cgf.13962

DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning

Abstract: We present DRLViz, a visual analytics interface to interpret the internal memory of an agent (e.g. a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors updated when the agent moves in an environment and is not trivial to understand due to the number of dimensions, dependencies to past vectors, spatial/temporal correlations, and co‐correlation between dimensions. It is often referred to as a black box as only inputs (images) and outputs (actions) are intelligible…

Cited by 27 publications (19 citation statements)
References 41 publications
“…Papers that describe applications that do not allow user interaction with the model beyond filtering of data points (i.e., purely exploratory systems in which the user can neither influence the model behavior during the analysis session nor optimize towards a specific model output) (e.g., [JVW20; LLT∗20; XXL∗20]).…”
Section: Methods (mentioning)
confidence: 99%
“… Papers that provide use cases or usage scenarios developed by the authors without the inclusion of expert feedback, or case studies that do not consider human factors pertaining to the expert (e.g., [GWGvW19; KTC∗19; LJLH19; PLM∗17; SJS∗18; WPB∗20]). This was the most frequent reason for exclusion. Papers that describe applications that do not allow user interaction with the model beyond filtering of data points (i.e., purely exploratory systems in which the user can neither influence the model behavior during the analysis session nor optimize towards a specific model output) (e.g., [JVW20; LLT∗20; XXL∗20]). Papers not describing system evaluations but research agendas (e.g., [AVW∗18]), or workshops (e.g., [AW18; BCP∗19]). Papers that provide quantitative evaluations of results not generated by participants in a study setting (e.g., [BZL∗18; YDP19]). …”
Section: Methods (mentioning)
confidence: 99%
“…DQNViz [42] focuses on analyzing the training of a deep RL agent, from an overview statistical level down to individual epochs. More recently, DRLViz [22] visualizes the internal memory of a Deep RL agent as a way to interpret its decisions. Likewise, DynamicsExplorer [20] is a diagnostic tool meant to look into the learnt policy under different dynamic settings.…”
Section: Visualization For XAI (mentioning)
confidence: 99%
“…Before Model Building
Improving Data Quality (31): [3], [11], [14], [16], [17], [18], [25], [45], [61], [91], [96], [101], [102], [118], [123], [125], [136], [144], [157], [193], [202], [204], [205], [214], [228], [229], [232], [257], [259], [268], [275]
Improving Feature Quality (6): [109], [132], [184], [195], [223], [239]
During Model Building
Model Understanding (30): [28], [38], [56], [71], [79], [84], [104], [115], [116], [119], [120], [137],…”
Section: Technique Category Papers Trend (mentioning)
confidence: 99%