ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9413949

Lightweight Dual-Task Networks For Crowd Counting In Aerial Images

Abstract: As a research hotspot in computer vision, crowd counting methods have achieved success on natural images. However, crowd counting in aerial images is rarely explored, and existing methods perform poorly there because of the higher resolution, smaller object scale, and more complex scenes. This paper therefore proposes a lightweight dual-task network (LDNet) for crowd counting, which uses only a bifurcated structure to overcome these new challenges in aerial images, without complicated pipelines. To realize this, a comp…

Citations: cited by 3 publications (2 citation statements)
References: 20 publications

“…Compared to the PPAM block in our model, our method takes fewer computations to process scale-aware information and the Swin-Transformer block is more effective in extracting global context information. Tian et al [41] and Zhang et al [42] put forward guidance branch to their lightweight model to learn localization information in their work; such a technique needs precise head location coordinates to guide location task and that is not mandatory in lightweight crowd-counting tasks. When it comes to hybrid network architecture, Sun et al [43] introduce Transformer blocks after each downscale convolution block to separately model scale-varied information stage by stage; it is not computationally efficient to introduce multiple Transformer blocks for the same thing.…”
Section: Lightweight Crowd-counting Models
Citation type: mentioning
confidence: 99%
“…Just like previous work [15,39,41], we simply use mean absolute error (MAE) and mean square error (MSE) to evaluate our model; they are defined respectively as:…”
Section: Evaluation Metrics
Citation type: mentioning
confidence: 99%
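
The citation statement above is cut off before its definitions, so the citing paper's exact formulas are not available here. As a rough illustration only, the sketch below computes the two metrics on per-image crowd counts using the textbook definitions. The function names and sample counts are hypothetical. Note that crowd-counting papers often report the square root of the mean squared error under the name "MSE"; whether the citing paper does so is an assumption, so both variants are shown.

```python
import math


def count_mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth counts."""
    errors = [abs(p - t) for p, t in zip(pred_counts, true_counts)]
    return sum(errors) / len(errors)


def count_mse(pred_counts, true_counts):
    """Mean squared error (textbook definition, no square root)."""
    squared = [(p - t) ** 2 for p, t in zip(pred_counts, true_counts)]
    return sum(squared) / len(squared)


def count_rmse(pred_counts, true_counts):
    """Root of the mean squared error; often labelled "MSE" in
    crowd-counting papers (assumption: not confirmed for this paper)."""
    return math.sqrt(count_mse(pred_counts, true_counts))


# Hypothetical per-image counts, for illustration only.
predicted = [10, 20, 30]
ground_truth = [12, 18, 33]

print(count_mae(predicted, ground_truth))
print(count_mse(predicted, ground_truth))
print(count_rmse(predicted, ground_truth))
```

MAE weights every image's counting error equally, while the squared variants penalize large per-image errors more heavily, which is why the two are usually reported together.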