2020
DOI: 10.48550/arxiv.2011.12450
Preprint
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Cited by 46 publications (114 citation statements)
References 45 publications
“…This includes the vanilla DETR [3] method improved with 300 queries, reference points, and focal loss as described by [41], as well as the Deformable DETR [41]. We also present the reported performance of RCNN-based methods [2,18,27,29,30,33] and other DETR variants [6,8,23,36,39]. From the results in Table 1, we can observe that our method consistently improves different R50-based baseline methods by around 2 points in AP using 50 epochs.…”
Section: Comparison with Different DETR Methods
confidence: 80%
“…Despite their effectiveness, this type of RoI-based refinement cannot be directly applied to the fully end-to-end pipeline of DETR, because the two rely on different optimization goals and it still requires NMS. More recently, some methods, such as Efficient DETR [39], TSP-RCNN [30], and Sparse RCNN [29], also use RoIs to achieve improved performance with a Transformer while avoiding NMS. However, we argue that these methods are still based on the typical two-stage detection pipeline of Faster RCNN [27] and mainly apply the Transformer to approximate NMS.…”
Section: Improvement of Transformer in Computer Vision
confidence: 99%
“…The initial learning rate is 0.01, and it decays by a factor of 0.1 at the 8th and 11th epochs. We choose Faster RCNN with FPN and Sparse RCNN [37] for comparison.…”
Section: Methods
confidence: 99%
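The training recipe quoted above is a standard multi-step learning-rate schedule. A minimal sketch in plain Python, assuming zero-indexed epochs and a 12-epoch run (both assumptions, not stated in the excerpt); frameworks such as PyTorch provide the same behavior as `MultiStepLR`:

```python
def step_lr(epoch, base_lr=0.01, milestones=(8, 11), gamma=0.1):
    """Return the learning rate for a given epoch under a step schedule:
    start at base_lr and multiply by gamma at each milestone epoch reached."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Illustrative 12-epoch run (epoch count is an assumption for the sketch):
schedule = [step_lr(e) for e in range(12)]
# epochs 0-7 use 0.01, epochs 8-10 use 0.001, epoch 11 uses 0.0001 (up to float rounding)
```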
“…Anchor-free approaches [15,40,49] replace hand-crafted anchors with reference points. Recently, end-to-end detectors [10,37,50] have removed both hand-crafted anchors and non-maximum suppression via bipartite matching. The implicit feature refinement introduced in this paper can be used to refine the instance features of one-stage object detectors as well.…”
Section: Object Detection
confidence: 99%
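The bipartite matching mentioned above assigns each prediction to at most one ground-truth box by minimising a total matching cost, which is what lets end-to-end detectors drop NMS. A brute-force sketch (real detectors such as DETR solve the same assignment with the Hungarian algorithm; the cost matrix here is made up for illustration):

```python
from itertools import permutations

def bipartite_match(cost):
    """Minimum-cost one-to-one assignment between N predictions and N
    ground-truth boxes, found by exhaustive search over permutations.
    cost[i][j] is the matching cost of prediction i to target j."""
    n = len(cost)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best_cost, best = c, perm
    return list(best), best_cost

# Toy cost matrix (hypothetical values): each row is a prediction,
# each column a ground-truth box.
cost = [[0.1, 0.9, 0.8],
        [0.7, 0.2, 0.9],
        [0.8, 0.7, 0.3]]
assignment, total = bipartite_match(cost)
# assignment pairs prediction i with target assignment[i]; total is the summed cost
```

Exhaustive search is O(N!) and only viable for tiny N; the Hungarian algorithm used in practice solves the same problem in polynomial time.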