RGB-D datasets for robotic perception in site-specific agricultural operations—A survey

Kurtser, Polina; Lowry, Stephanie

doi:10.1016/j.compag.2023.108035

Cited by 7 publications

(4 citation statements)

References 69 publications

“…In an extensive literature review on the use of depth data for agricultural applications, Kurtser and Lowry [23] identified only one paper in which a combination of color and depth data was used for object detection [24]. They dealt with the use case of apple detection.…”

Section: Discussion (mentioning)

confidence: 99%

“…In a literature review, Kurtser and Lowry [23] identified 24 papers that applied a combination of both color and depth data (RGB-D data) for robotic perception in precision agriculture. Almost all these papers focused on object detection solely using RGB images.…”

Section: Introduction (mentioning)

confidence: 99%

“…One of the main reasons that RGB-D data are not frequently used for object detection in agriculture is the lack of large general RGB-D datasets to pretrain deep neural networks [23]. Therefore, it is important to investigate other methods that can facilitate better training of RGB-D models, such as data augmentation and transfer learning from the RGB domain.…”

Section: Introduction (mentioning)

confidence: 99%

Stereo Vision for Plant Detection in Dense Scenes

Ruigrok, van Henten, Kootstra. 2024. Sensors.

Automated precision weed control requires visual methods to discriminate between crops and weeds. State-of-the-art plant detection methods fail to reliably detect weeds, especially in dense and occluded scenes. In the past, using hand-crafted detection models, both color (RGB) and depth (D) data were used for plant detection in dense scenes. Remarkably, the combination of color and depth data is not widely used in current deep learning-based vision systems in agriculture. Therefore, we collected an RGB-D dataset using a stereo vision camera. The dataset contains sugar beet crops in multiple growth stages with varying weed densities. This dataset was made publicly available and was used to evaluate two novel plant detection models, the D-model, using the depth data as the input, and the CD-model, using both the color and depth data as inputs. For ease of use with existing 2D deep learning architectures, the depth data were transformed into a 2D image using color encoding. As a reference model, the C-model, which uses only color data as the input, was included. The limited availability of suitable training data for depth images demands the use of data augmentation and transfer learning. Using our three detection models, we studied the effectiveness of data augmentation and transfer learning for depth data transformed to 2D images. It was found that geometric data augmentation and transfer learning were equally effective for both the reference model and the novel models using the depth data. This demonstrates that combining color-encoded depth data with geometric data augmentation and transfer learning can improve the RGB-D detection model. However, when testing our detection models on the use case of volunteer potato detection in sugar beet farming, it was found that the addition of depth data did not improve plant detection at high vegetation densities.
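The color encoding of depth data described in this abstract can be illustrated with a minimal sketch. The abstract does not state which encoding was used, so everything here is an assumption made for illustration and not the authors' method: the hue mapping (near = red, far = blue), the range parameters `d_min` and `d_max`, the treatment of non-positive depth as invalid, and the function names `encode_depth_as_color` and `hflip_pair`. The second function shows one geometric augmentation, a horizontal flip, applied identically to the color image and the encoded depth image so that the two stay pixel-aligned.

```python
import colorsys


def encode_depth_as_color(depth, d_min, d_max):
    """Map a 2D grid of depth values to 8-bit RGB triples.

    Assumption (not from the paper): depth is mapped linearly onto hue,
    with d_min -> red (hue 0) and d_max -> blue (hue 2/3). Non-positive
    depth is treated as an invalid measurement and rendered black.
    """
    if d_max <= d_min:
        raise ValueError("d_max must be greater than d_min")
    encoded = []
    for row in depth:
        out_row = []
        for d in row:
            if d <= 0:
                out_row.append((0, 0, 0))
                continue
            # Normalise to [0, 1], clamping values outside the range.
            t = min(max((d - d_min) / (d_max - d_min), 0.0), 1.0)
            r, g, b = colorsys.hsv_to_rgb(t * 2.0 / 3.0, 1.0, 1.0)
            out_row.append((round(r * 255), round(g * 255), round(b * 255)))
        encoded.append(out_row)
    return encoded


def hflip_pair(color, depth_rgb):
    """Flip both images horizontally so they remain aligned."""
    return ([row[::-1] for row in color], [row[::-1] for row in depth_rgb])


if __name__ == "__main__":
    depth_map = [[0.5, 1.5], [0.0, 1.0]]  # hypothetical depths in metres
    print(encode_depth_as_color(depth_map, d_min=0.5, d_max=1.5))
```

Because the encoded depth is an ordinary three-channel image, it can be passed to a detector designed for RGB input, which is the convenience the abstract points to.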

“…The significance of point-cloud processing has surged across various domains, such as robotics [1,2], medical field [3,4], autonomous driving [5,6], metrology [7][8][9], etc. Over the past few years, advancements in vision sensors have led to remarkable improvements, enabling these sensors to provide real-time 3D measurements of the surroundings while maintaining decent accuracy [10,11]. Consequently, point-cloud processing forms an essential pivot of numerous applications by facilitating robust object detection, segmentation, and classification operations.…”

Section: Introduction (mentioning)

confidence: 99%

FusionVision: A Comprehensive Approach of 3D Object Reconstruction and Segmentation from RGB-D Cameras Using YOLO and Fast Segment Anything

El Ghazouali, Mhirit, Oukhrid et al. 2024. Sensors.

In the realm of computer vision, the integration of advanced techniques into the pre-processing of RGB-D camera inputs poses a significant challenge, given the inherent complexities arising from diverse environmental conditions and varying object appearances. Therefore, this paper introduces FusionVision, an exhaustive pipeline adapted for the robust 3D segmentation of objects in RGB-D imagery. Traditional computer vision systems face limitations in simultaneously capturing precise object boundaries and achieving high-precision object detection on depth maps, as they are mainly proposed for RGB cameras. To address this challenge, FusionVision adopts an integrated approach by merging state-of-the-art object detection techniques with advanced instance segmentation methods. The integration of these components enables a holistic (unified analysis of information obtained from both color RGB and depth D channels) interpretation of RGB-D data, facilitating the extraction of comprehensive and accurate object information in order to improve post-processes such as object 6D pose estimation, Simultaneous Localization and Mapping (SLAM) operations, accurate 3D dataset extraction, etc. The proposed FusionVision pipeline employs YOLO for identifying objects within the RGB image domain. Subsequently, FastSAM, an innovative semantic segmentation model, is applied to delineate object boundaries, yielding refined segmentation masks. The synergy between these components and their integration into 3D scene understanding ensures a cohesive fusion of object detection and segmentation, enhancing overall precision in 3D object segmentation.
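The final step this abstract describes, carrying a 2D segmentation mask into 3D using the depth channel, can be sketched as a pinhole back-projection. This is an illustrative sketch under stated assumptions, not the FusionVision implementation: the detector and segmenter are not called (the mask is supplied directly), and the function name `mask_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names for a standard pinhole camera model with depth aligned to the color image.

```python
def mask_to_points(mask, depth, fx, fy, cx, cy):
    """Back-project masked depth pixels to 3D camera coordinates.

    Assumptions (not from the paper): `mask` and `depth` are equally sized
    2D grids, depth is aligned to the color image, and a pinhole model with
    focal lengths (fx, fy) and principal point (cx, cy) applies. Pixels
    with non-positive depth are skipped as invalid measurements.
    """
    points = []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth)):
        for u, (inside, z) in enumerate(zip(mask_row, depth_row)):
            if inside and z > 0:
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Hypothetical 2x2 example: the mask selects three pixels, one of
    # which has no valid depth reading.
    mask = [[0, 1], [1, 1]]
    depth = [[1.0, 2.0], [0.0, 4.0]]
    print(mask_to_points(mask, depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5))
```

In a full pipeline the mask would come from the segmentation stage and the intrinsics from the camera calibration; the resulting points form the per-object point cloud that later steps can consume.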

Accelerated Data Engine: A faster dataset construction workflow for computer vision applications in commercial livestock farms

Wu, Zhou et al. 2024. Computers and Electronics in Agriculture.
