Self-supervised Generative Adversarial Network for Depth Estimation in Laparoscopic Images

Huang, Baoru; Zheng, Jian-Qing; Nguyen, Anh; Tuch, David S.; Vyas, Kunal; Giannarou, Stamatia; Elson, Daniel S.

doi:10.1007/978-3-030-87202-1_22

Cited by 28 publications (17 citation statements)

References 27 publications

“…L_{ss-disp} (7) is a combination of the photometric loss L_{ph} (8), the structural similarity image metric loss L_{ssim} (9), and the disparity smoothness loss L_{smooth} (11).…”

Section: B Training Modes 1) Pre-training

mentioning

confidence: 99%
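The quoted statement names three terms but gives neither their equations nor their weights. As a minimal sketch only, assuming a plain weighted sum, an L1 photometric term, and a simplified SSIM term computed from whole-image statistics (the citing paper's equations (7) to (9) may differ), the combination could look like this. The function names and the weights `w_ph`, `w_ssim`, and `w_smooth` are hypothetical, and `reconstructed` stands for whatever view-synthesis output is compared against the target image.

```python
import numpy as np


def photometric_l1(target, reconstructed):
    """Mean absolute intensity difference between two images (assumed L1 form)."""
    return float(np.mean(np.abs(target - reconstructed)))


def global_ssim_loss(target, reconstructed, c1=0.01 ** 2, c2=0.03 ** 2):
    """(1 - SSIM) / 2 from whole-image statistics.

    Simplification: practical implementations compute SSIM over local windows.
    """
    mu_x, mu_y = target.mean(), reconstructed.mean()
    var_x, var_y = target.var(), reconstructed.var()
    cov = ((target - mu_x) * (reconstructed - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return float(np.clip((1.0 - ssim) / 2.0, 0.0, 1.0))


def self_supervised_disparity_loss(target, reconstructed, smoothness,
                                   w_ph=1.0, w_ssim=1.0, w_smooth=1.0):
    """Weighted sum of the three terms; the weights are placeholders."""
    return (
        w_ph * photometric_l1(target, reconstructed)
        + w_ssim * global_ssim_loss(target, reconstructed)
        + w_smooth * smoothness
    )
```

With a perfect reconstruction the photometric and SSIM terms vanish, so only the smoothness term contributes to the total.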

“…L_{smooth} (11) penalizes sharp disparity transitions in the absence of edges in I_l using an edge-aware disparity smoothness term.…”

Section: B Training Modes 1) Pre-training

mentioning

confidence: 99%

“…In L_{smooth} (11), |•| computes the absolute value and ||•|| computes the mean across the 3 colour channels. Additionally, while computing (11), we normalize d by dividing its values with the max disparity search range (320).…”

Section: B Training Modes 1) Pre-training

mentioning

confidence: 99%
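Equation (11) itself is not reproduced in the quoted statements, so the following is only a sketch of a commonly used edge-aware smoothness form that matches the description: absolute disparity gradients, down-weighted by the exponential of the image gradient averaged over the 3 colour channels, with the disparity first divided by the search range of 320. The exact form, the use of forward differences, and the function name are assumptions.

```python
import numpy as np

MAX_DISPARITY = 320.0  # max disparity search range quoted in the statement


def edge_aware_smoothness(disparity, image, max_disparity=MAX_DISPARITY):
    """Edge-aware smoothness for an (H, W) disparity map and (H, W, 3) image.

    Assumed form: mean(|dx d| * exp(-||dx I||)) + mean(|dy d| * exp(-||dy I||)),
    where ||.|| is the mean absolute gradient over the colour channels.
    """
    d = disparity / max_disparity  # normalise disparity by the search range

    # Absolute forward differences of the normalised disparity.
    dd_x = np.abs(d[:, 1:] - d[:, :-1])
    dd_y = np.abs(d[1:, :] - d[:-1, :])

    # Image gradients, averaged over the colour channels.
    di_x = np.mean(np.abs(image[:, 1:, :] - image[:, :-1, :]), axis=-1)
    di_y = np.mean(np.abs(image[1:, :, :] - image[:-1, :, :]), axis=-1)

    # Disparity changes cost less where the image itself has an edge.
    return float(np.mean(dd_x * np.exp(-di_x)) + np.mean(dd_y * np.exp(-di_y)))
```

Under this form, a disparity step that coincides with an image edge is penalised less than the same step in a textureless region, and dividing by the search range keeps the term's magnitude independent of the disparity scale.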

“…Meanwhile, 3D geometry estimation models from stereo have seen less active development but recent additions to the available datasets for training and validation may facilitate learning-based disparity estimation [10]. Furthermore, self-supervised disparity estimation approaches have been developed and aim to alleviate the need for depth ground truth data [11], [12], [13]. Related approaches also focus on the development of both camera motion estimation and 3D reconstruction [14], [15] and on the estimation of depth using monocular endoscopes [16].…”

mentioning

confidence: 99%

MSDESIS: Multitask Stereo Disparity Estimation and Surgical Instrument Segmentation

Psychogyios, Mazomenos, Vasconcelos, et al. 2022

IEEE Trans. Med. Imaging

Reconstructing the 3D geometry of the surgical site and detecting instruments within it are important tasks for surgical navigation systems and robotic surgery automation. Traditional approaches treat each problem in isolation and do not account for the intrinsic relationship between segmentation and stereo matching. In this paper, we present a learning-based framework that jointly estimates disparity and binary tool segmentation masks. The core component of our architecture is a shared feature encoder which allows strong interaction between the aforementioned tasks. Experimentally, we train two variants of our network with different capacities and explore different training schemes including both multi-task and single-task learning. Our results show that supervising the segmentation task improves our network's disparity estimation accuracy. We demonstrate a domain adaptation scheme where we supervise the segmentation task with monocular data and achieve domain adaptation of the adjacent disparity task, reducing disparity End-Point-Error and depth mean absolute error by 77.73% and 61.73% respectively compared to the pre-trained baseline model. Our best overall multi-task model, trained with both disparity and segmentation data in subsequent phases, achieves 89.15% mean Intersection-over-Union in RIS and 3.18 millimetre depth mean absolute error in SCARED test sets. Our proposed multi-task architecture is real-time, able to process (1280x1024) stereo input and simultaneously estimate disparity maps and segmentation masks at 22 frames per second. The model code and pre-trained models are made available: https://github.com/dimitrisPs/msdesis
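The abstract reports disparity End-Point-Error and depth mean absolute error without defining them. The sketch below uses the standard definitions: mean absolute disparity difference over valid pixels, and mean absolute depth difference after converting disparity to depth with the rectified-stereo relation depth = focal length × baseline / disparity. The validity masks, function names, and the camera values in the usage note are assumptions, not the paper's evaluation protocol.

```python
import numpy as np


def end_point_error(pred_disp, gt_disp, valid=None):
    """Mean absolute disparity error (pixels) over valid ground-truth pixels."""
    if valid is None:
        # Assumed convention: non-positive or non-finite ground truth is invalid.
        valid = np.isfinite(gt_disp) & (gt_disp > 0)
    return float(np.mean(np.abs(pred_disp[valid] - gt_disp[valid])))


def disparity_to_depth(disp, focal_px, baseline_mm):
    """Depth in mm for a rectified stereo pair; NaN where disparity <= 0."""
    depth = np.full(disp.shape, np.nan)
    positive = disp > 0
    depth[positive] = focal_px * baseline_mm / disp[positive]
    return depth


def depth_mae(pred_depth, gt_depth):
    """Mean absolute depth error (mm) where both depths are finite."""
    valid = np.isfinite(pred_depth) & np.isfinite(gt_depth)
    return float(np.mean(np.abs(pred_depth[valid] - gt_depth[valid])))
```

For example, with a hypothetical focal length of 1000 px and a 4 mm baseline, a ground-truth disparity of 10 px corresponds to a depth of 400 mm, so a 2 px disparity error at that pixel (12 px predicted) shifts the estimated depth to about 333 mm.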

“…Researchers then formulate the depth estimation as an image reconstruction problem with pixels storing range values. With the success of CNNs [14,4,11,6] and Transformer [13], further efforts have been made for better exploiting discriminative information that is valuable to depth estimation.…”

Section: Introduction

mentioning

confidence: 99%

3D endoscopic depth estimation using 3D surface-aware constraints

Shang, Wang, Liu, et al. 2022

Preprint

Robotic-assisted surgery allows surgeons to conduct precise surgical operations with stereo vision and flexible motor control. However, the lack of 3D spatial perception limits situational awareness during procedures and hinders mastering surgical skills in the narrow abdominal space. Depth estimation, as a representative perception task, is typically defined as an image reconstruction problem. In this work, we show that depth estimation can be reformulated from a 3D surface perspective. We propose a loss function for depth estimation that integrates surface-aware constraints, leading to faster and better convergence by exploiting valid spatial information. In addition, camera parameters are incorporated into the training pipeline to increase the control and transparency of the depth estimation. We also integrate a specularity removal module to recover more buried image information. Quantitative experimental results on endoscopic datasets and user studies with medical professionals demonstrate the effectiveness of our method.