Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources

Qin, Yidan; Pedram, Sahba Aghajani; Feyzabadi, Seyedshams; Allan, Max; McLeod, A. Jonathan; Burdick, Joel W.; Azizian, Mahdi

doi:10.48550/arxiv.2002.02921

Cited by 4 publications

(10 citation statements)

References 28 publications


“…Surgical task such as suturing can be practically modeled as a Finite State Machine (FSM), with a list of discrete states (actions and non-actions) and possible transitions between states [17]. Classically, the task is formulated as a MM and the transition probability matrix is learned from data [8,12,18-20].…”

Section: Endoscopic Vision (mentioning)

confidence: 99%
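As a minimal sketch of what "the transition probability matrix is learned from data" can mean, the block below estimates a row-normalized transition matrix by counting consecutive state pairs in labeled sequences. The function name, the integer state encoding, and the toy sequences are illustrative assumptions; none of them come from the cited papers.

```python
from collections import Counter


def estimate_transition_matrix(sequences, num_states):
    """Estimate P[i][j] = P(next state = j | current state = i) by counting.

    Hypothetical helper for illustration only. States are assumed to be
    integers in range(num_states). Rows with no observed outgoing
    transitions are left as all zeros.
    """
    counts = Counter()
    for seq in sequences:
        # Count every consecutive (current, next) pair in the sequence.
        for prev, nxt in zip(seq, seq[1:]):
            counts[(prev, nxt)] += 1

    matrix = []
    for i in range(num_states):
        row_total = sum(counts[(i, j)] for j in range(num_states))
        row = [counts[(i, j)] / row_total if row_total else 0.0
               for j in range(num_states)]
        matrix.append(row)
    return matrix


# Toy labeled state sequences (made-up data, not from any dataset).
sequences = [[0, 0, 1, 2], [0, 1, 1, 2]]
P = estimate_transition_matrix(sequences, num_states=3)
```

Each row with at least one observed transition sums to 1, so it can be read directly as a distribution over next states.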

“…This is especially useful in the prediction of surgical states, since different sources of input data have their respective strengths and weaknesses in representing states with various kinematics and visual features. Previously, we have proposed a unified model for surgical state estimation, Fusion-KVE, that incorporated multiple types of input data and exceeded the state-of-the-art state estimation performance [17]. Building on this, we explored the task of concurrent instrument path and surgical state predictions with multiple data streams and the incorporation of historic state transition sequences.…”

Section: Endoscopic Vision (mentioning)

confidence: 99%

“…Our model's performance was evaluated using the JIGSAWS suturing dataset [32] and the extended Robotic Intra-Operative Ultrasound (RIOUS+) imaging dataset [17]. The RIOUS+ dataset contains ultrasound imaging trials in various experimental settings (phantom, in-vivo, and cadaver) and endoscopic motion.…”

Section: Endoscopic Vision (mentioning)

confidence: 99%

“…Implementation details: Each endoscope video frame was resized to a 224 × 224 × 3 RGB image before being input to the VGG-16 model. The VGG-16 model was pre-trained following our previous work [17]. m = 1024 CNN features were extracted.…”

Section: Visual Feature Encoder (mentioning)

confidence: 99%

“…It is worth noting that in order to obtain the historic sequence of surgical state s from t − T_obs + 1 to t, we implemented Fusion-KVE, a unified surgical state estimation model we proposed recently [17], instead of using the ground truth (GT) state sequence. In real-time RAS settings, the surgical state prediction model does not have access to the manually-labeled historic surgical state sequence; therefore, a state estimation model is needed to provide the historic state sequence.…”

Section: Instrument Path and Surgical State Predictions (mentioning)

confidence: 99%
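The observation window described in the statement above (estimated states from t − T_obs + 1 to t) can be sketched as a fixed-length buffer that is fed by a state estimator rather than by ground-truth labels. The class name, its methods, and the toy stream of estimates are assumptions made for illustration; this is not the Fusion-KVE or daVinciNet implementation.

```python
from collections import deque


class StateHistory:
    """Keeps the most recent T_obs estimated states (hypothetical helper)."""

    def __init__(self, t_obs):
        # A deque with maxlen drops the oldest entry automatically,
        # so the buffer always covers t - T_obs + 1 .. t once it is full.
        self.window = deque(maxlen=t_obs)

    def push(self, estimated_state):
        """Append the estimator's output for the current time step."""
        self.window.append(estimated_state)

    def ready(self):
        """True once T_obs estimates have been observed."""
        return len(self.window) == self.window.maxlen

    def sequence(self):
        """Return the buffered states, oldest first."""
        return list(self.window)


# Made-up per-frame state estimates standing in for a real estimator's output.
history = StateHistory(t_obs=3)
for estimated in [0, 0, 1, 2, 2]:
    history.push(estimated)
```

A downstream predictor would read `history.sequence()` only after `history.ready()` returns true, so that it never sees a partially filled window.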


daVinciNet: Joint Prediction of Motion and Surgical State in Robot-Assisted Surgery

Qin, Feyzabadi, Allan et al. 2020

Preprint (self-citation)


This paper presents a technique to concurrently and jointly predict the future trajectories of surgical instruments and the future state(s) of surgical subtasks in robot-assisted surgeries (RAS) using multiple input sources. Such predictions are a necessary first step towards shared control and supervised autonomy of surgical subtasks. Minute-long surgical subtasks, such as suturing or ultrasound scanning, often have distinguishable tool kinematics and visual features, and can be described as a series of fine-grained states with transition schematics. We propose daVinciNet, an end-to-end dual-task model for robot motion and surgical state predictions. daVinciNet performs concurrent end-effector trajectory and surgical state predictions using features extracted from multiple data streams, including robot kinematics, endoscopic vision, and system events. We evaluate our proposed model on an extended Robotic Intra-Operative Ultrasound (RIOUS+) imaging dataset collected on a da Vinci® Xi surgical system and the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS). Our model achieves up to 93.85% short-term (0.5 s) and 82.11% long-term (2 s) state prediction accuracy, as well as 1.07 mm short-term and 5.62 mm long-term trajectory prediction error.


Learning-based Fast Path Planning in Complex Environments

Liu, Tingguang et al. 2021

2021 IEEE International Conference on Robotics and Biomimetics (ROBIO)


Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows

Ban, Rosman, Ward et al. 2020

Preprint


Analyzing surgical workflow is crucial for computers to understand surgeries. Deep learning techniques have recently been widely applied to recognize surgical workflows. Many of the existing temporal neural network models are limited in their capability to handle long-term dependencies in the data, instead relying upon strong performance of the underlying per-frame visual models. We propose a new temporal network structure that leverages task-specific network representation to collect long-term sufficient statistics that are propagated by a sufficient statistics model (SSM). We leverage our approach within an LSTM backbone for the task of surgical phase recognition and explore several choices for propagated statistics. We demonstrate superior results over existing state-of-the-art segmentation and novel segmentation techniques, on two laparoscopic cholecystectomy datasets: the already published Cholec80 dataset and MGH100, a novel dataset with more challenging, yet clinically meaningful, segment labels.
