2021
DOI: 10.48550/arxiv.2111.11044
Preprint

Exploring Segment-level Semantics for Online Phase Recognition from Surgical Videos

Abstract: Automatic surgical phase recognition plays an important role in robot-assisted surgeries. Existing methods ignored a pivotal problem that surgical phases should be classified by learning segment-level semantics instead of solely relying on frame-wise information. In this paper, we present a segment-attentive hierarchical consistency network (SAHC) for surgical phase recognition from videos. The key idea is to extract hierarchical high-level semantic-consistent segments and use them to refine the erroneous predi…

Citations: cited by 1 publication (2 citation statements)
References: 31 publications (58 reference statements)
“…We use the ResNet50 [11] as the backbone to extract features for each frame. After that, following [24,28,6], a multi-stage temporal convolution (MS-TCN) [8] is used to extract temporal relations for frame features and predict phase results. Following the same evaluation protocols [24,13,14,15], we employ four commonly-used metrics, i.e., accuracy (AC), precision (PR), recall (RE), and Jaccard (JA) to evaluate the phase prediction accuracy.…”
Section: Implementation Details (mentioning)
confidence: 99%
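
The four metrics named in this citation statement can be illustrated with a minimal sketch. It assumes frame-wise integer phase labels and averages precision, recall, and Jaccard over the phases present in the ground truth; the function name `phase_metrics` is hypothetical, and the exact protocol of the cited works (for example per-video averaging or relaxed phase boundaries) is not stated here and may differ.

```python
from typing import Dict, Sequence


def phase_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Frame-wise accuracy plus phase-averaged precision, recall and Jaccard.

    Hypothetical helper for illustration; not the evaluation code of any cited paper.
    """
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, jaccards = [], [], []
    # Assumption: average only over phases that occur in the ground truth.
    for phase in sorted(set(y_true)):
        tp = sum(t == phase and p == phase for t, p in pairs)
        fp = sum(t != phase and p == phase for t, p in pairs)
        fn = sum(t == phase and p != phase for t, p in pairs)
        # Precision is defined as 0 when the phase is never predicted.
        precisions.append(tp / (tp + fp) if (tp + fp) else 0.0)
        # tp + fn > 0 is guaranteed because the phase occurs in y_true.
        recalls.append(tp / (tp + fn))
        jaccards.append(tp / (tp + fp + fn))

    n_phases = len(precisions)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_phases,
        "recall": sum(recalls) / n_phases,
        "jaccard": sum(jaccards) / n_phases,
    }
```

Calling `phase_metrics(labels, predictions)` on two equally long label sequences returns a dictionary with the keys `accuracy`, `precision`, `recall`, and `jaccard`, each in the range 0 to 1.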
“…on the target datasets, has achieved great success for image recognition and video understanding [10,4,2,19]. This paper focuses on designing self-supervised learning methods for surgical video understanding with a downstream task, surgical phase recognition, which aims to predict what phase is occurring for each frame in a video [1,29,13,14,15,6,9,25,24,28]. Self-supervised learning has been widely applied to various medical images, such as X-ray [30], fundus images [16,17], CT [34] and MRI [32,30].…”
Section: Introduction (mentioning)
confidence: 99%