2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00682

ThunderNet: Towards Real-Time Generic Object Detection on Mobile Devices

Abstract: Real-time generic object detection on mobile platforms is a crucial but challenging computer vision task. However, previous CNN-based detectors suffer from enormous computational cost, which hinders them from real-time inference in computation-constrained scenarios. In this paper, we investigate the effectiveness of two-stage detectors in real-time generic detection and propose a lightweight two-stage detector named ThunderNet. In the backbone part, we analyze the drawbacks in previous lightweight backbones and…

Cited by 240 publications (142 citation statements)
References 31 publications
“…This is because these two detection heads are some of the first and seminal works on end-to-end trainable detection heads from the R-CNN family and do not require multi-stage progressive training such as R-CNN [27] and Fast R-CNN [28]. Furthermore, prior research [29], [37], [38] has demonstrated that region-proposal based detection heads are typically more accurate than unified framework based detection heads.…”
Section: Choice Of Object Detection Heads (mentioning)
confidence: 99%
“…Moreover, it consists of a Context Enhancement Module (CEM) and a Mobile Spatial Attention Module (MSAM). The key idea of CEM that leverages semantic and context information from multiple scales is to aggregate multi-scale local and global information to produce more discriminating features and the receptive field size plays an important role in CNN models [32]. CNNs can only capture information inside the receptive field.…”
Section: Plos One (mentioning)
confidence: 99%
“…In terms of simplifying neural networks, Mobilenet [23] is a lightweight neural network proposed by Google for mobile devices, which effectively reduces the amount of parameters and calculations. ShuffleNet [24], PeleeNet [25] and ThunderNet [26] enable the network model to be further optimized and become smaller and faster.…”
Section: Image Classification Network (mentioning)
confidence: 99%