2019 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2019.8793683
Large-Scale Object Mining for Object Discovery from Unlabeled Video

Abstract: This paper addresses the problem of object discovery from unlabeled driving videos captured in a realistic automotive setting. Identifying recurring object categories in such raw video streams is a very challenging problem. Not only do object candidates first have to be localized in the input images, but many interesting object categories occur relatively infrequently. Object discovery will therefore have to deal with the difficulties of operating in the long tail of the object distribution. We demonstrate the…


Cited by 30 publications (21 citation statements)
References 76 publications (181 reference statements)
“…We further use a triplet-loss based ReID embedding network to calculate a ReID embedding vector for each mask proposal. We use the feature embedding network proposed in [24]. This is based on a wide ResNet variant [32] pre-trained on ImageNet [6] and then trained on the COCO dataset [19] using cropped bounding boxes resized to 128 × 128 pixels.…”
Section: ReID Embedding Vectors
confidence: 99%
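The preprocessing step described in the quote above (cropping each proposal's bounding box and resizing it to 128 × 128 before it is fed to the embedding network) can be sketched roughly as follows. This is a minimal stand-in, not the quoted work's implementation: the function name is hypothetical, nearest-neighbour sampling is used for brevity, and the wide-ResNet embedding network itself is not shown.

```python
import numpy as np

def crop_and_resize(image, box, out_size=128):
    """Crop a proposal's bounding box from `image` (H x W x C) and
    resize the crop to out_size x out_size via nearest-neighbour
    sampling. `box` is (x0, y0, x1, y1) in pixel coordinates."""
    x0, y0, x1, y1 = box
    crop = image[y0:y1, x0:x1]
    h, w = crop.shape[:2]
    # Map each output pixel back to a source pixel in the crop.
    ys = (np.arange(out_size) * h // out_size).clip(0, h - 1)
    xs = (np.arange(out_size) * w // out_size).clip(0, w - 1)
    return crop[ys[:, None], xs[None, :]]
```

In practice a bilinear resize (e.g. from an image library) would replace the nearest-neighbour sampling, but the shape contract is the same: every crop comes out as a fixed 128 × 128 input for the embedding network.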
“…We begin our analysis with previous trackers [73,62,21], which attempt to tackle open-world tracking, but have not had adequate benchmarks to evaluate on. Unfortunately, these methods do not directly extend to the TAO-OW domain, requiring additional sensors [73] or assuming objects move [21] or are present throughout the videos [62].…”
Section: Developing and Analyzing Open-world Trackers
confidence: 99%
“…Since using all 1000 proposals for tracking is inefficient, we investigate various methods for scoring and ranking predictions in Figure 5 (left). 'Score' is the standard score of the most confident class among the known classes; 'objectness' uses the proposal score from the region proposal network in Mask R-CNN, which is trained in a class-agnostic way, as used in [73]. 'bgScore' is defined as 1 − Σ_c score(c), where the sum is over all known classes c, and measures how much the object is not one of the known objects.…”
Section: Proposal Generation
confidence: 99%
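The 'bgScore' definition in the quote above is a one-line computation. A minimal sketch (the function name and the score dictionary format are illustrative, not from the quoted work):

```python
def bg_score(class_scores):
    """bgScore = 1 - sum of per-known-class confidences.
    `class_scores` maps each known class name to the classifier's
    score for one proposal; a value near 1 means the proposal
    looks unlike every known class, flagging a potential
    open-world (unknown) object."""
    return 1.0 - sum(class_scores.values())
```

For example, a proposal confidently classified as a known class such as 'car' gets a low bgScore, while a proposal with near-zero scores for all known classes gets a bgScore close to 1.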
“…This design choice is motivated by the fact that such point trajectories are reliably estimated by optical flow techniques (Section II-B). Moreover, recent research on VPR [4], motion segmentation [17], and novelty detection [18] showed that such a point trajectory is often a stable part of the environment (e.g., landmarks).…”
Section: A Graph-based Map Segmentation
confidence: 99%
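The point trajectories mentioned above are built by chaining per-frame optical-flow displacements. A generic sketch of that accumulation step, assuming dense flow fields are already available (this is an illustration of the idea, not the quoted work's tracker; real pipelines typically use a library routine such as OpenCV's pyramidal Lucas–Kanade tracker instead):

```python
import numpy as np

def track_points(points, flows):
    """Accumulate point trajectories across frames.
    `points` is a list of (x, y) start positions; `flows` is one
    H x W x 2 dense flow field (dx, dy) per consecutive frame pair.
    Returns one trajectory (list of (x, y)) per input point."""
    trajs = [[tuple(p)] for p in points]
    for flow in flows:
        h, w = flow.shape[:2]
        for traj in trajs:
            x, y = traj[-1]
            # Sample the flow at the nearest valid pixel.
            xi = int(round(min(max(x, 0), w - 1)))
            yi = int(round(min(max(y, 0), h - 1)))
            dx, dy = flow[yi, xi]
            traj.append((x + dx, y + dy))
    return trajs
```

Points whose accumulated positions stay consistent across many frames are the "stable" trajectories the passage refers to, making them good candidates for landmarks.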