Free space estimation is an important problem for autonomous robot navigation. Traditional camera-based approaches rely on pixel-wise ground truth annotations to train a segmentation model. To cover the wide variety of environments and lighting conditions encountered on roads, training supervised models requires large datasets, which makes the annotation cost prohibitively high. In this work, we propose a novel approach for obtaining free space estimates from images taken with a single road-facing camera. We rely on a technique that generates weak free space labels without any supervision, which are then used as ground truth to train a segmentation model for free space estimation. We study the impact of different data augmentation techniques on the performance of free space prediction, and propose a recursive training strategy. Our results are benchmarked on the Cityscapes dataset and improve over comparable published work across all evaluation metrics. Our best model reaches 83.64% IoU (+2.3%), 91.75% Precision (+2.4%) and 91.29% Recall (+0.4%). These results correspond to 88.8% of the IoU, 94.3% of the Precision and 93.1% of the Recall obtained by an equivalent fully-supervised baseline, while using no ground truth annotation. Our code and models are freely available online.
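The IoU, Precision, and Recall figures reported above follow the standard definitions for binary segmentation masks. A minimal sketch (the function and variable names below are ours, not from the paper):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Standard IoU, Precision, and Recall for a binary free-space mask.

    pred, gt: boolean arrays of the same shape (True = free space).
    """
    tp = np.logical_and(pred, gt).sum()   # predicted free and actually free
    fp = np.logical_and(pred, ~gt).sum()  # predicted free but not free
    fn = np.logical_and(~pred, gt).sum()  # free but missed
    iou = tp / (tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return iou, precision, recall

# Toy 2x2 example with one false positive and one false negative:
pred = np.array([[True, True], [False, False]])
gt = np.array([[True, False], [True, False]])
iou, p, r = segmentation_metrics(pred, gt)
# tp=1, fp=1, fn=1, so iou = 1/3 and precision = recall = 0.5
```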
Free space estimation is an important problem for autonomous robot navigation. Traditional camera-based approaches train a segmentation model using an annotated dataset. The training data needs to capture the wide variety of environments and weather conditions encountered at runtime, making the annotation cost prohibitively high. In this work, we propose a novel approach for obtaining free space estimates from images taken with a single road-facing camera. We rely on a technique that generates weak free space labels without any supervision, which are then used as ground truth to train a segmentation model for free space estimation. Our work differs from prior attempts by explicitly taking label noise into account through the use of Co-Teaching. Since Co-Teaching has traditionally been investigated in classification tasks, we adapt it for segmentation and examine how its parameters affect performance in our experiments. In addition, we propose Stochastic Co-Teaching, a novel clean-sample selection method that leads to improved results. We achieve an IoU of 82.6%, a Precision of 90.9%, and a Recall of 90.3%. Our best model reaches 87% of the IoU, 93% of the Precision, and 93% of the Recall of the equivalent fully-supervised baseline while using no human annotations. To the best of our knowledge, this work is the first to use Co-Teaching to train a free space segmentation model under explicit label noise. Our implementation and models are freely available online.
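At the core of Co-Teaching is a small-loss selection rule: two networks are trained in parallel, and each one picks its lowest-loss samples (assumed clean) to train its peer. The paper's exact segmentation adaptation and the Stochastic variant are not reproduced here; this is only a sketch of the generic selection step, with per-image mean losses and a fixed forget rate as our assumptions:

```python
import numpy as np

def coteaching_select(loss_a, loss_b, forget_rate):
    """Small-loss sample selection from Co-Teaching (illustrative sketch).

    loss_a, loss_b: per-sample losses from networks A and B on the same batch.
    Each network selects the (1 - forget_rate) fraction of samples with the
    smallest loss; the *peer* network then updates on that selection.
    """
    keep = int(round((1.0 - forget_rate) * len(loss_a)))
    idx_for_b = np.argsort(loss_a)[:keep]  # A's clean picks train B
    idx_for_a = np.argsort(loss_b)[:keep]  # B's clean picks train A
    return idx_for_a, idx_for_b

loss_a = np.array([0.9, 0.1, 0.4, 2.0])
loss_b = np.array([0.2, 1.5, 0.3, 0.1])
ia, ib = coteaching_select(loss_a, loss_b, forget_rate=0.5)
# A trains on B's small-loss picks {0, 3}; B trains on A's picks {1, 2}
```

Cross-updating on the peer's selection, rather than each network's own, is what prevents the two models from reinforcing the same noisy labels.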
Environmental perception is a key element of autonomous driving because the information received from the perception module influences core driving decisions. An outstanding challenge in real-time perception for autonomous driving lies in finding the best trade-off between detection quality and latency. Major constraints on both computation and power have to be taken into account for real-time perception in autonomous vehicles. Larger object detection models tend to produce the best results, but are also slower at runtime. Since the most accurate detectors cannot run in real time locally, we investigate the possibility of offloading computation to edge and cloud platforms, which are less resource-constrained. We create a synthetic dataset to train object detection models and evaluate different offloading strategies. Using real hardware and network simulations, we compare different trade-offs between prediction quality and end-to-end delay. Since sending raw frames over the network implies additional transmission delays, we also explore the use of JPEG and H.265 compression at varying qualities and measure their impact on prediction metrics. We show that, with adequate compression, models can run in real time in the cloud while outperforming local detection.
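The trade-off being evaluated can be summarized with a back-of-the-envelope delay model: offloading pays off when the transmission cost of a (compressed) frame plus remote inference beats local inference. The paper measures this on real hardware and simulated networks; the formula and parameter names below are purely illustrative assumptions:

```python
def end_to_end_delay_ms(frame_bytes, uplink_mbps, inference_ms, rtt_ms=0.0):
    """Rough end-to-end delay for offloaded detection (illustrative only).

    Delay = serialization of the frame over the uplink + network round trip
    + server-side inference time. Ignores queuing, jitter, and decode time.
    """
    transmit_ms = frame_bytes * 8 / (uplink_mbps * 1e6) * 1e3
    return transmit_ms + rtt_ms + inference_ms

# A 100 kB JPEG frame over a 10 Mbit/s uplink, 20 ms RTT, 30 ms cloud inference:
delay = end_to_end_delay_ms(100_000, 10, 30, rtt_ms=20)
# 80 ms transmit + 20 ms RTT + 30 ms inference = 130 ms total
```

Under such a model, halving the frame size through stronger compression directly halves the transmission term, which is why the compression-quality sweep matters for the end-to-end budget.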
Identifying traversable space is one of the most important problems in autonomous robot navigation and is primarily tackled using learning-based methods. To alleviate the prohibitively high annotation cost associated with labeling large and diverse datasets, research has recently shifted from traditional supervised methods to focus on unsupervised and semi-supervised approaches. This work focuses on monocular road segmentation and proposes a practical, generic, and minimally-supervised approach based on task-specific feature extraction and pseudo-labeling. Building on recent advances in monocular depth estimation models, we process approximate dense depth maps to estimate pixel-wise road-plane distance maps. These maps are then used in both unsupervised and semi-supervised road segmentation scenarios. In the unsupervised case, we propose a pseudo-labeling pipeline that reaches state-of-the-art Intersection-over-Union (IoU), while reducing complexity and computations compared to existing approaches. We also investigate a semi-supervised extension to our method and find that even minimal labeling efforts can greatly improve results. Our semi-supervised experiments, using as little as 1% and 10% of the ground truth data, yield models scoring 0.9063 and 0.9332 on the IoU metric, respectively. These results correspond to 95.9% and 98.7% of a fully-supervised model's IoU score, which motivates a pragmatic approach to labeling.
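The road-plane distance idea can be sketched as follows: back-project depth to 3D points, fit a plane to points assumed to be road, and measure each point's distance to that plane; points close to the plane become "road" pseudo-labels. The least-squares fit and the threshold below are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def plane_distances(points, road_pts):
    """Distance of each 3D point to a plane fit on assumed-road points.

    points, road_pts: arrays of shape (N, 3) of back-projected 3D points.
    Fits z = a*x + b*y + c by least squares, then returns the perpendicular
    distance of every point to that plane.
    """
    A = np.c_[road_pts[:, 0], road_pts[:, 1], np.ones(len(road_pts))]
    (a, b, c), *_ = np.linalg.lstsq(A, road_pts[:, 2], rcond=None)
    n = np.array([a, b, -1.0])          # normal of a*x + b*y - z + c = 0
    return np.abs(points @ n + c) / np.linalg.norm(n)

# Four coplanar (z = 0) road points, then one on-plane and one elevated query:
road = np.array([[0, 0, 0.0], [1, 0, 0.0], [0, 1, 0.0], [1, 1, 0.0]])
pts = np.array([[0.5, 0.5, 0.0], [0.5, 0.5, 1.0]])
dist = plane_distances(pts, road)
mask = dist < 0.1  # pixels near the plane become road pseudo-labels
```

Thresholding such a distance map yields weak labels without any human annotation, which is what makes the unsupervised branch of the pipeline possible.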