2022
DOI: 10.1109/tmi.2022.3147640
Gesture Recognition in Robotic Surgery With Multimodal Attention

Abstract: Automatically recognising surgical gestures from surgical data is an important building block of automated activity recognition and analytics, technical skill assessment, intra-operative assistance and eventually robotic automation. The complexity of articulated instrument trajectories and the inherent variability due to surgical style and patient anatomy make analysis and fine-grained segmentation of surgical motion patterns from robot kinematics alone very difficult. Surgical video provides crucial informati…

Cited by 40 publications (24 citation statements) · References 46 publications
“…Such explainability, which we systematically investigate in a concurrent study 17 is critical to gaining the trust of surgeons and ensuring the safe deployment of AI systems for high-stakes decision making such as skill-based surgeon credentialing. This is in contrast to previous AI systems such as MA-TCN 12 , which is only capable of highlighting the relative importance of data modalities (for example, images versus kinematics), and therefore lacks the finer level of explainability of SAIS. SAIS is also flexible in that it can accept video samples with an arbitrary number of video frames as input, primarily due to its transformer architecture.…”
Section: Discussion
confidence: 68%
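The statement above contrasts SAIS with MA-TCN, which can only report the relative importance of whole data modalities (images versus kinematics). As a minimal sketch of that idea, not code from either cited system, the following hypothetical helper scores one embedding per modality, softmaxes the scores into weights, and returns both the fused feature and the weights that serve as the modality-level explainability signal:

```python
import numpy as np

def modality_attention(video_feat, kin_feat, w):
    """Hypothetical modality-level attention fusion (illustrative only).

    video_feat, kin_feat: per-modality embeddings of shape (d,).
    w: learned scoring vector of shape (d,).
    Returns the attention-weighted fused embedding and the
    per-modality weights (the explainability signal).
    """
    feats = np.stack([video_feat, kin_feat])      # (2, d): one row per modality
    scores = feats @ w                            # (2,): one score per modality
    weights = np.exp(scores - scores.max())       # numerically stable softmax
    weights /= weights.sum()                      # weights sum to 1
    fused = weights @ feats                       # (d,): weighted combination
    return fused, weights
```

With a real model the scoring vector `w` would be learned end to end; here it is just an input so the mechanics are visible.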
“…Validating on external video datasets. To contextualize our work with previous methods, we also trained SAIS to distinguish between suturing gestures on two publicly available datasets: JHU-ISI gesture and skill assessment working set ( JIGSAWS) 11 and dorsal vascular complex University College London (DVC UCL) 12 (Methods). While the former contains videos of participants in a laboratory setting, the latter contains videos of surgeons in a particular step (dorsal vascular complex) of the live robot-assisted radical prostatectomy (RARP) procedure.…”
Section: Generalizing Across Hospitals
confidence: 99%
“…34 However, the nonlinear, highly variable nature of surgical motions poses significant challenges to analysis. 35 Additionally, many surgical tools have similar shapes and can be obscured from view by hand positioning. 36 Advancements in deep learning, specifically region-based convolutional neural networks (R-CNNs), have worked to address recognition challenges by identifying regions of interest with selective search to help localize objects of interest.…”
Section: Introduction
confidence: 99%
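The R-CNN-style pipelines mentioned above localise regions of interest and match them to instrument ground truth by overlap. The standard overlap measure is intersection-over-union (IoU); a minimal stand-alone scorer (a generic helper, not code from the cited work) looks like this:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Overlap rectangle: max of the left/top edges, min of the right/bottom edges.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

Identical boxes score 1.0 and disjoint boxes score 0.0; detection pipelines typically count a proposal as a match above a threshold such as 0.5.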