2020 IEEE 23rd International Conference on Information Fusion (FUSION)
DOI: 10.23919/fusion45008.2020.9190215
VADR: Discriminative Multimodal Explanations for Situational Understanding

Cited by 4 publications (5 citation statements)
References 16 publications
“…Multimodal Explanations: Figure 4 shows the detail associated with a simple event where a shooting has been detected in the city. This uses the selective relevance technique (Taylor et al. 2020) to highlight features in the input that were most salient to the detection of the event. Here we can see red highlighting on the person to the left, who is the active shooter (the other person is merely walking past).…”
Section: Overview of SUE and Examples
Mentioning, confidence: 99%
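The highlighting described in the statement above belongs to the broad family of input-attribution methods. As a rough illustration only, and not the authors' selective relevance algorithm, a gradient-based saliency map over a single video frame can be sketched as follows; model, frame_tensor and class_idx are hypothetical placeholders.

# Minimal sketch of gradient-based input saliency for a frame classifier.
# Illustrative only; not the "selective relevance" method of Taylor et al. (2020).
import torch

def saliency_map(model, frame, target_class):
    """Return per-pixel relevance of `frame` (shape 1 x C x H x W) to `target_class`."""
    frame = frame.detach().clone().requires_grad_(True)
    score = model(frame)[0, target_class]   # logit for the detected activity
    score.backward()                        # gradients w.r.t. input pixels
    # Collapse the channel dimension; large values mark salient regions.
    return frame.grad.abs().max(dim=1)[0].squeeze(0)

# Usage (hypothetical): overlay the returned map on the frame to highlight,
# for example, the person most responsible for a "shooting" detection.
# sal = saliency_map(cnn_model, frame_tensor, class_idx)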
“…In Section II we define a simple worked example using the selective audio-visual relevance (SAVR) explainability technique - derived from the research reported in [3] - to highlight the video and audio aspects of an incoming sensor feed that most influenced a particular classification.…”
Section: B. Explainability
Mentioning, confidence: 99%
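To make the idea of per-modality relevance concrete, the following sketch splits gradient magnitude between the audio and video inputs of a joint classifier. It is an assumption-laden illustration of the general approach, not the SAVR implementation itself; the two-argument model(video, audio) interface is hypothetical.

# Hedged sketch of per-modality relevance for an audio-visual classifier.
import torch

def modality_relevance(model, video, audio, target_class):
    video = video.detach().clone().requires_grad_(True)
    audio = audio.detach().clone().requires_grad_(True)
    score = model(video, audio)[0, target_class]
    score.backward()
    # Summed gradient magnitude per modality indicates which input stream
    # most influenced this particular classification decision.
    return {
        "video": video.grad.abs().sum().item(),
        "audio": audio.grad.abs().sum().item(),
    }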
“…In earlier work [3] we have defined a technique for providing explanations for the classification decision of a multimodal (audio and video) activity recognition system. The deep neural network model was trained [3] using a generic activity dataset (UCF-101 [7]) and is therefore able to output classification decisions for a wide range of human activities that can be sensed through audio-visual means such as CCTV or camera sources. These classifications will be for specific activities that are unlikely to be of particular relevance to the domain problem in which they are applied.…”
Section: Worked Example
Mentioning, confidence: 99%
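For context, a late-fusion audio-visual classifier with a 101-way output (matching the UCF-101 label set) can be sketched as below. This is an illustrative architecture only, not the network described in [3]; the feature dimensions are placeholders.

# Illustrative two-stream fusion classifier for audio-visual activity recognition.
import torch
import torch.nn as nn

class AudioVisualClassifier(nn.Module):
    def __init__(self, video_dim=512, audio_dim=128, num_classes=101):
        super().__init__()
        self.video_net = nn.Sequential(nn.Linear(video_dim, 256), nn.ReLU())
        self.audio_net = nn.Sequential(nn.Linear(audio_dim, 256), nn.ReLU())
        self.head = nn.Linear(512, num_classes)  # late fusion by concatenation

    def forward(self, video_feats, audio_feats):
        v = self.video_net(video_feats)
        a = self.audio_net(audio_feats)
        return self.head(torch.cat([v, a], dim=-1))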