2020
DOI: 10.1109/tits.2019.2939832

Restricted Deformable Convolution-Based Road Scene Semantic Segmentation Using Surround View Cameras

Abstract: Understanding the surrounding environment of the vehicle is still one of the challenges for autonomous driving. This paper addresses 360-degree road scene semantic segmentation using surround view cameras, which are widely equipped in existing production cars. First, in order to address the large-distortion problem in fisheye images, Restricted Deformable Convolution (RDC) is proposed for semantic segmentation, which can effectively model geometric transformations by learning the shapes of convolutional filter…
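The restricted-offset idea in the abstract can be made concrete with a short sketch. Below is a minimal, hypothetical PyTorch implementation built on torchvision.ops.deform_conv2d; the class name, the choice of a 3×3 kernel, and the "fixed centre, eight learned offsets" restriction are one plausible reading of RDC from the abstract, not the authors' released code.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d


class RestrictedDeformableConv2d(nn.Module):
    """Sketch of a restricted deformable 3x3 convolution.

    Hypothetical reading of RDC: the eight outer sampling locations of
    the kernel learn 2-D offsets so the filter can adapt its shape to
    fisheye distortion, while the centre location keeps a fixed zero
    offset so the filter stays anchored at its output position.
    """

    def __init__(self, in_channels: int, out_channels: int,
                 stride: int = 1, padding: int = 1):
        super().__init__()
        self.stride, self.padding = stride, padding
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, 3, 3) * 0.01)
        # Offset branch predicts (dy, dx) for the 8 non-centre positions.
        self.offset_conv = nn.Conv2d(in_channels, 2 * 8, kernel_size=3,
                                     stride=stride, padding=padding)
        # Zero init: the layer starts out as a plain 3x3 convolution.
        nn.init.zeros_(self.offset_conv.weight)
        nn.init.zeros_(self.offset_conv.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        off8 = self.offset_conv(x)                  # (B, 16, H', W')
        b, _, h, w = off8.shape
        off8 = off8.view(b, 8, 2, h, w)
        zero = off8.new_zeros(b, 1, 2, h, w)        # fixed centre offset
        # Re-insert the centre position (index 4 in row-major 3x3 order).
        offset = torch.cat([off8[:, :4], zero, off8[:, 4:]],
                           dim=1).reshape(b, 18, h, w)
        return deform_conv2d(x, offset, self.weight,
                             stride=self.stride, padding=self.padding)
```

As a quick check, RestrictedDeformableConv2d(64, 128)(torch.randn(1, 64, 32, 32)) yields a (1, 128, 32, 32) tensor, and the zero-initialized offset branch means the layer behaves exactly like an ordinary 3×3 convolution at the start of training.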

Cited by 119 publications (59 citation statements)
References 40 publications
“…This conversion from the 2D to the 360° domain is accomplished by enforcing consistency between the predictions of the 2D projected views and those in the 360° image. Moreover, recent work on convolutions [26,27] that, in addition to learning their weights, also learn their shape is very well suited for learning the distortion model of spherical images, even though they have only been applied to fisheye lenses up to now [28]. Finally, very recently, Spherical CNNs were proposed in [29,30] that are based on a rotation-equivariant definition of spherical cross-correlation.…”
Section: Learning for 360° Images
confidence: 99%
“…On this basis, they built a panoramic view system using four fisheye cameras and proposed Restricted Deformable Convolution (RDC) for semantic segmentation, which shows decent effectiveness in handling images with large distortions. 16 W. Zhou et al. 17 stitched semantic images via a lens array containing three 100-degree-FoV lenses with varying orientations to attain semantic understanding of a wider FoV; even so, they still only achieve 180-degree semantic perception of the forward-view environment. R. Varga et al. 18 used four fisheye cameras, four 360-degree LIDARs and a GPS/IMU sensor to build a super-sensor that can provide 360-degree understanding of the surroundings when fitted to an autonomous vehicle.…”
Section: Related Work
confidence: 99%
“…Owing to attractive qualities such as low power consumption and high mobility, researchers have been keen to examine the possibility of building smaller architectures. MobileNets [17], ICNet [18], and ERFNet [19] have shown reasonable accuracy and real-time performance on modern GPUs. However, none of these methods achieves inference at frame rate on an NVIDIA TX2.…”
Section: Related Work
confidence: 99%