Reliable feature correspondence between frames is a critical step in visual odometry (VO) and visual simultaneous localization and mapping (V-SLAM) algorithms. Compared with existing VO and V-SLAM algorithms, semi-direct visual odometry (SVO) has two main advantages that lead to state-of-the-art frame-rate camera motion estimation: direct pixel correspondence and an efficient implementation of a probabilistic mapping method. This paper improves SVO mapping by initializing the mean and the variance of the depth at a feature location according to the depth prediction from a single-image depth prediction network. By significantly reducing the depth uncertainty of the initialized map point (i.e., a small variance centred about the depth prediction), the benefits are twofold: reliable feature correspondence between views and fast convergence to the true depth, allowing new map points to be created. We evaluate our method on two outdoor datasets: the KITTI dataset and the Oxford RobotCar dataset. The experimental results indicate that the improved SVO mapping results in increased robustness and camera tracking accuracy.
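The mapping change described above amounts to seeding each new feature's depth filter with the network prediction instead of a broad prior. The minimal sketch below illustrates that idea only; the function name init_depth_filter, the rel_sigma parameter, and the 10% relative uncertainty are assumptions for illustration, not values taken from the paper.

```python
def init_depth_filter(pred_depth, rel_sigma=0.1):
    """Seed a per-feature depth filter from a network depth prediction.

    Stock SVO typically starts a new filter at the average scene depth with a
    large variance covering the whole depth range; here the mean is set to the
    predicted depth and the variance to a small value centred on it.
    rel_sigma (relative uncertainty of the prediction) is an assumed value.
    """
    mu = float(pred_depth)        # filter mean = predicted depth
    sigma = rel_sigma * mu        # small standard deviation about the prediction
    return mu, sigma ** 2         # (mean, variance) of the depth filter

# Example: a feature whose network-predicted depth is 12.5 m
mu, var = init_depth_filter(12.5)
```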
There has been tremendous research progress in estimating the depth of a scene from a monocular camera image. Existing methods for single-image depth prediction are exclusively based on deep neural networks, and their training can be unsupervised using stereo image pairs, supervised using LiDAR point clouds, or semi-supervised using both stereo and LiDAR. In general, semi-supervised training is preferred, as it does not suffer from the weaknesses of either supervised training, which stem from the difference between the camera's and the LiDAR's fields of view, or unsupervised training, which stem from the limited depth accuracy that can be recovered from a stereo pair. In this paper, we present our research in single-image depth prediction using semi-supervised training that outperforms the state of the art. We achieve this through a loss function that explicitly exploits left-right consistency in a stereo reconstruction, which has not been adopted in previous semi-supervised training. In addition, we describe the correct use of ground-truth depth derived from LiDAR, which can significantly reduce prediction error. The performance of our depth prediction model is evaluated on popular datasets, and the importance of each aspect of our semi-supervised training approach is demonstrated through experimental results. Our deep neural network model has been made publicly available.
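A left-right consistency term of this kind penalizes disagreement between the disparity predicted for one view and the disparity of the other view resampled through it. The sketch below is a simplified, non-differentiable version of such a check; the function name and the nearest-neighbour sampling are assumptions for illustration, whereas training losses in the spirit of Godard et al. use differentiable bilinear sampling.

```python
import numpy as np

def lr_consistency_loss(disp_left, disp_right):
    """Mean absolute left-right disparity mismatch for a rectified stereo pair.

    For every left-image pixel x, the right disparity map is sampled at
    x - d_left(x) and compared with d_left(x). Nearest-neighbour sampling is
    used here only to keep the sketch short.
    """
    h, w = disp_left.shape
    cols = np.tile(np.arange(w), (h, 1))
    # Columns in the right image that each left pixel maps to
    sample_cols = np.clip(np.round(cols - disp_left).astype(int), 0, w - 1)
    disp_right_warped = np.take_along_axis(disp_right, sample_cols, axis=1)
    return float(np.mean(np.abs(disp_left - disp_right_warped)))
```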
This article addresses the problem of video shot boundary detection and presents a novel shot boundary detection algorithm based on QR-decomposition and the modeling of gradual transitions by Gaussian functions. Specifically, we address the challenges of detecting gradual transitions and of extracting appropriate spatio-temporal features, both of which affect how efficiently algorithms can detect shot boundaries. The algorithm exploits the properties of QR-decomposition to extract a block-wise probability function that indicates the probability of video frames lying within shot transitions. This probability function exhibits abrupt changes at hard cuts and semi-Gaussian behavior over gradual transitions, and the algorithm detects both types of transition by analyzing it. Finally, we report the results of experiments on the large-scale test set provided by TRECVID 2006, which includes assessments for hard-cut and gradual shot boundary detection. These results confirm the high performance of the proposed algorithm.
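As a rough illustration of how QR-decomposition can yield a frame-wise transition score, the sketch below stacks per-frame feature vectors into a matrix and uses the magnitude of the diagonal of R as a measure of how much linearly independent content each frame contributes. The feature choice and the normalisation to [0, 1] are assumptions for illustration, not the paper's exact block-wise probability function.

```python
import numpy as np

def transition_score(frame_features):
    """Per-frame transition score from QR-decomposition.

    frame_features: (n_frames, n_dims) array of per-frame feature vectors
    (e.g. block histograms), with n_dims >= n_frames so that each frame gets
    one diagonal entry of R. Large |R[i, i]| means frame i adds much linearly
    independent content relative to the preceding frames, which is suggestive
    of a shot transition.
    """
    A = np.asarray(frame_features, dtype=float).T   # columns = frames
    _, R = np.linalg.qr(A)
    novelty = np.abs(np.diag(R))
    return novelty / (novelty.max() + 1e-12)        # scale to [0, 1]
```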
In this study, we propose methods for the automatic detection of photospheric features (bright points and granules) from ultraviolet (UV) radiation, using a feature-based classifier. The methods use quiet-Sun images at 214 nm and 525 nm taken by Sunrise on 9 June 2009. A region-growing procedure and a mean-shift procedure are applied to segment the bright points (BPs) and granules, respectively. Zernike moments of each region are then computed; the Zernike moments of BPs, granules, and other features are distinctive enough to be separated using a support vector machine (SVM) classifier. The size distribution of BPs can be fitted with a power law of slope -1.5. The peak of the granule size distribution is found to be about 0.5 arcsec². The mean filling factor is 0.01 for BPs and 0.51 for granules. There is a critical scale for granules: small granules with sizes below 2.5 arcsec² cover a wide range of brightness, while the brightness of large granules approaches unity. The mean brightness fluctuation is estimated to be 1.2 for BPs and 0.22 for granules. The mean horizontal velocities of an individual BP and of a BP within the network were found to be 1.6 km s⁻¹ and 0.9 km s⁻¹, respectively. We conclude that the effect of individual BPs in releasing energy into the photosphere, and possibly the upper layers, is stronger than that of BPs within the network.
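A compact sketch of the classification step described above, combining Zernike-moment descriptors of segmented regions with an SVM, is given below. The descriptor radius, polynomial degree, RBF kernel, and function names are illustrative assumptions rather than settings reported in the study; the Zernike moments are computed with the mahotas library.

```python
import numpy as np
import mahotas
from sklearn.svm import SVC

def zernike_descriptor(region_mask, radius=12, degree=8):
    """Zernike-moment descriptor of a segmented region (binary mask).
    The radius and degree are illustrative choices, not values from the study."""
    return mahotas.features.zernike_moments(region_mask.astype(np.uint8),
                                            radius, degree=degree)

def train_classifier(region_masks, labels):
    """Fit an SVM separating bright points, granules, and other features,
    given already-segmented region masks and integer labels
    (e.g. 0 = BP, 1 = granule, 2 = other)."""
    X = np.array([zernike_descriptor(m) for m in region_masks])
    clf = SVC(kernel="rbf")   # kernel choice is an assumption
    clf.fit(X, labels)
    return clf
```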