2021
DOI: 10.1109/access.2021.3104605

Mixed-Scale Unet Based on Dense Atrous Pyramid for Monocular Depth Estimation

Abstract: Monocular depth estimation is an undirected problem, so constructing a network to predict better image depth information is an important research topic. This paper proposes a mixed-scale Unet network (MAPNet) with a dense atrous pyramid based on the coder-decoder structure widely used in computer vision. We innovatively introduce the Unet++ structure of the image segmentation network for depth estimation. We reset the number of convolutional layers of the network under the framework of the Unet++ network and i…

Cited by 7 publications (7 citation statements)
References 48 publications
“…This module contains four high-performance digital microphones (MP34DT01-M), built-in capacitive sensing elements, and an I2C interface, supports far-field voice capture, and far-field voice capture recording, and understands needs within a range of up to 5 m. Figs. 5a and 5b present the diagrams of the ReSpeaker Mic Array v2.0.1 and the module system [19], respectively. The module also improves recording quality, reduces ambient voice echo, and employs AEC to eliminate current audio output.…”
Section: Construction and Training Of Baseline Acoustic Model
Citation type: mentioning
confidence: 99%
“…Apart from an improvement in dense and boundary segmentation, the residual skip connections also exhibited robustness to noise. In a similar fashion, MAPUNet [58], inspired by UNet++ [66], and UNet 3+ [19], exploited multi-scale feature fusion and supervision for monocular depth estimation. Moreover, a UNet++ variant with residual blocks and dense gated convolution based attention [60] was used for monocular depth estimation using sparse depth measurements [64].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Owing to its flexible and easily modifiable architecture, U-Net has been a commonly adopted CNN architecture as the basis for building new networks for the depth estimation task [29,30,31]. Saxena et al. [31] combined U-Net, using EfficientNet as the encoder, with the reconstruction of noisy depth maps to significantly improve depth estimation.…”
Section: Introduction
Citation type: unclassified
“…Jan and Seo [32] replaced the plain 2D convolution layers of U-Net with residual convolution layers to improve the extraction of more expressive features, and added an attention mechanism before each decoder layer, enabling the network to also identify small features and build refined depth maps. Yang et al. [29] proposed a supervised architecture based on UNet++ (a version of U-Net that includes nested connections between encoder and decoder [14]) and adopted dilated convolution pyramids to reduce computational cost and increase the network's capacity to capture image features at multiple scales. Aiming to develop a lighter solution, Guzzo and Gazolli [33] proposed a depth estimation approach that uses the UNet++ architecture with the pre-trained MobileNetV2 neural network as the encoder.…”
Section: Introduction
Citation type: unclassified