<p>To whom it may concern,</p><p>In this
paper, we provide an in-depth and comprehensive survey on recent technical
advancements in deep learning for omnidirectional vision, and also provide new
perspectives for future directions.<br></p><p><br></p><p>Deep learning (DL) has recently been applied to omnidirectional vision.
DL-based omnidirectional vision methods often achieve state-of-the-art
(SoTA) performance on various benchmark datasets. Diverse deep neural network
(DNN) models have been developed, ranging from convolutional neural network
(CNN)-based models to vision transformer (ViT)-based ones. Generally, the SoTA
DNN-based methods differ from each other in four major aspects: the convolutional
filters used to extract features from omnidirectional image (ODI) data, the network
design considering the number of inputs and projection types, omnidirectional
vision with novel learning strategies, and practical applications.<br></p><p><br></p><p>In this paper, we provide a comprehensive and systematic review and
analysis of the recent progress in DL methods for omnidirectional vision. There
are some previous surveys in the literature. However, some of them focus
on specific vision tasks, especially room layout reconstruction and 3D scene
geometry recovery. Moreover, others provide only limited reviews of omnidirectional
video streaming methods. <i>Unlike existing
surveys, this paper highlights the importance of deep learning and probes the
recent advances for omnidirectional vision, both methodically and
comprehensively. To the best of our knowledge, this is the <b>first</b> survey to comprehensively review and analyze the DL methods
for omnidirectional vision.</i><br></p><p><br></p><p>The main contributions
of this paper to the community are five-fold: </p>
<p>(I)
We comprehensively review and analyze the DL methods for omnidirectional vision,
including the omnidirectional imaging principle, representation learning,
datasets, a taxonomy, and applications, to highlight the differences from and
difficulties compared with 2D planar image data.</p>
<p>(II)
We conduct an analytical study of the recent trends of DL for omnidirectional vision,
both hierarchically and structurally. Moreover, we offer insights into the
discussion and challenges of each category. </p>
<p>(III)
We summarize the latest novel learning strategies and potential applications for
omnidirectional vision.</p><p>(IV)
We provide insightful discussions of the challenges and open problems yet to be
solved and propose potential future directions to spur more in-depth
research by the community.</p><p>(V)
We create an open-source repository that provides a taxonomy of all the mentioned
works and code links, and we hope it can shed light on future research.</p><p>The remainder of this
paper is organized as follows. In Sec. 2, we introduce the imaging principle of
ODI, convolution methods for omnidirectional vision, and some representative
datasets. In Sec. 3, we review and analyze the existing DL approaches for
various tasks and provide taxonomies to categorize the relevant papers. Sec. 4 covers
novel learning paradigms for the tasks in omnidirectional vision, e.g.,
unsupervised learning, transfer learning, and reinforcement learning. Sec. 5 then
scrutinizes the applications, followed by Sec. 6, where we
discuss open problems and future directions. Finally, we conclude
this paper in Sec. 7. Furthermore, we summarize most, if not all, representative
works (over 200 papers) from the last five years, published in the top-tier
conferences and journals in computer vision/graphics and machine learning. <i>Due to space limitations, we present the
experimental results in Sec. 2 of the supplementary material.</i></p><p><br></p><p>We believe that this paper will stimulate significant interest in these
crucial topics in the community and provide fundamental technical references
for further research.</p>