2018
DOI: 10.1007/978-3-030-01240-3_13

Find and Focus: Retrieve and Localize Video Events with Natural Language Queries

Cited by 86 publications
(36 citation statements)
References 35 publications
“…1. It has also drawn great attention from industry due to its various applications such as video question answering (Lei et al, 2018), video content retrieval (Shao et al, 2018), and human-computer interaction, etc.…”
Section: Introduction (mentioning)
confidence: 99%
“…Considering the naive proposal method provides no flexible window size, it cannot locate moment more accurately. In view of this, Shao et al [37] proposed to use the correlation between each clip and sentence to select candidate windows. The Query-guided Segment Proposal Network (QSPN) proposed by Xu et al [38] is also along the same lines, and they also added video captioning as a secondary auxiliary to help training.…”
Section: A Supervised Methods (mentioning)
confidence: 99%
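
The citation statement above describes selecting candidate windows from the correlation between each clip and the sentence. A minimal sketch of that general idea follows, assuming per-clip relevance scores have already been computed by some model. The function name `candidate_windows`, the fixed `threshold`, and the rule of grouping consecutive clips are illustrative assumptions, not the procedure used by Shao et al.

```python
from typing import List, Tuple


def candidate_windows(scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive clips whose query relevance meets a threshold.

    `scores[i]` is an assumed clip-to-sentence relevance for clip i.
    Returns inclusive (start, end) clip indices for each candidate window.
    This is a hypothetical illustration, not the cited paper's method.
    """
    windows: List[Tuple[int, int]] = []
    start = None  # index where the current run of relevant clips began
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # open a new window
        elif start is not None:
            windows.append((start, i - 1))  # close the window before clip i
            start = None
    if start is not None:
        # The last run extends to the final clip.
        windows.append((start, len(scores) - 1))
    return windows


if __name__ == "__main__":
    print(candidate_windows([0.1, 0.8, 0.9, 0.2, 0.7], threshold=0.5))
```

Unlike a fixed sliding window, the window lengths here follow the scores, which is the flexibility the statement says naive proposal methods lack.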
“…Current mainstream trackers [10,11,12,129,130,131,132,133,134,135] adopt tracking-by-detection (TBD) by first performing per-frame detection and then associating the detected boxes in the temporal dimension. Current works [13,136,137,138,139] leverage trajectories or tubes to capture motion trails of targets. MOT variants include e.g., video object segmentation (VOS) [14,15], video instance segmentation (VIS) [16], multi-object tracking and segmentation (MOTS) [140] and video panoptic segmentation (VPS) [141].…”
Section: Object Tracking (mentioning)
confidence: 99%