2020 17th Conference on Computer and Robot Vision (CRV)
DOI: 10.1109/crv50864.2020.00038
SpotNet: Self-Attention Multi-Task Network for Object Detection

Cited by 45 publications (24 citation statements)
References 24 publications
“…Second, there is no need to implement NMS for the final inference, which repeatedly computes the IoU between the estimated and observed bounding boxes. As an extension of CenterNet, some researchers developed the SpotNet model by adding a semantic segmentation head to its architecture [22]. During training, the semantic segmentation head is supervised with silhouettes derived by a background subtraction method applied to consecutive video frames.…”
Section: Related Work
confidence: 99%
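For context, IoU is the pairwise overlap measure that classical greedy NMS evaluates repeatedly; keypoint-based detectors such as CenterNet and SpotNet skip this post-processing step entirely. A minimal sketch of both routines (the `(x1, y1, x2, y2)` box format and function names are illustrative assumptions, not the paper's code):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, thresh=0.5):
    """Keep the highest-scoring box, suppress boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep
```

The quadratic pairwise IoU loop inside `greedy_nms` is exactly the cost that center-point detectors avoid at inference time.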
“…The performance of YOLO models ranges between 0.6 and 0.8 when measured by the mAP score. CenterNet and its extension SpotNet have also been adopted in studies of vehicle detection using the UA-DETRAC dataset [3], [22], [31]-[32]. These two keypoint-based detection models have recorded mAP scores exceeding 0.8, higher than any other detector considered.…”
Section: Related Work
confidence: 99%
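Since mAP is the metric used to compare these detectors, a minimal sketch of per-class average precision may help; it assumes detections have already been ranked by score and matched to ground truth (the TP/FP flag list and `num_gt` parameter are illustrative assumptions, not the benchmark's exact protocol, which typically also interpolates the precision curve):

```python
def average_precision(hits, num_gt):
    """AP as the area under the precision-recall curve.

    hits: ranked detections as booleans (True = matched a ground-truth box).
    num_gt: total number of ground-truth boxes for this class.
    """
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for h in hits:
        if h:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_gt
        # Accumulate precision over each recall increment (rectangle rule)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

mAP is then the mean of this value across object classes (and, in COCO-style evaluation, across IoU thresholds as well).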
“…Attention mechanisms [27-35] have been successfully applied to various tasks such as machine translation, neural image/video captioning, and object detection, since they can assign greater weights to highly responsive feature maps. Bahdanau et al.…”
Section: Related Work
confidence: 99%
“…Thus, even though [16] also uses an architecture based on reinforcement learning, the goal of our work is different: our objective is to produce good spotting. Such an idea can be found in object spotting [21] in 2D images, but we believe we are the first to apply it to temporal action spotting. Our framework also does not require video retargeting [22], [23].…”
Section: Related Work
confidence: 99%