Search citation statements
Paper Sections
Citation Types
Year Published
Publication Types
Relationship
Authors
Journals
Objective skill assessment-based personal performance feedback is a vital part of surgical training. Either kinematic—acquired through surgical robotic systems, mounted sensors on tooltips or wearable sensors—or visual input data can be employed to perform objective algorithm-driven skill assessment. Kinematic data have been successfully linked with the expertise of surgeons performing Robot-Assisted Minimally Invasive Surgery (RAMIS) procedures, but for traditional, manual Minimally Invasive Surgery (MIS), they are not readily available as a method. 3D visual features-based evaluation methods tend to outperform 2D methods, but their utility is limited and not suited to MIS training, therefore our proposed solution relies on 2D features. The application of additional sensors potentially enhances the performance of either approach. This paper introduces a general 2D image-based solution that enables the creation and application of surgical skill assessment in any training environment. The 2D features were processed using the feature extraction techniques of a previously published benchmark to assess the attainable accuracy. We relied on the JHU–ISI Gesture and Skill Assessment Working Set dataset—co-developed by the Johns Hopkins University and Intuitive Surgical Inc. Using this well-established set gives us the opportunity to comparatively evaluate different feature extraction techniques. The algorithm reached up to 95.74% accuracy in individual trials. The highest mean accuracy—averaged over five cross-validation trials—for the surgical subtask of Knot-Tying was 83.54%, for Needle-Passing 84.23% and for Suturing 81.58%. The proposed method measured well against the state of the art in 2D visual-based skill assessment, with more than 80% accuracy for all three surgical subtasks available in JIGSAWS (Knot-Tying, Suturing and Needle-Passing). By introducing new visual features—such as image-based orientation and image-based collision detection—or, from the evaluation side, utilising other Support Vector Machine kernel methods, tuning the hyperparameters or using other classification methods (e.g., the boosted trees algorithm) instead, classification accuracy can be further improved. We showed the potential use of optical flow as an input for RAMIS skill assessment, highlighting the maximum accuracy achievable with these data by evaluating it with an established skill assessment benchmark, by evaluating its methods independently. The highest performing method, the Residual Neural Network, reached means of 81.89%, 84.23% and 83.54% accuracy for the skills of Suturing, Needle-Passing and Knot-Tying, respectively.
Objective skill assessment-based personal performance feedback is a vital part of surgical training. Either kinematic—acquired through surgical robotic systems, mounted sensors on tooltips or wearable sensors—or visual input data can be employed to perform objective algorithm-driven skill assessment. Kinematic data have been successfully linked with the expertise of surgeons performing Robot-Assisted Minimally Invasive Surgery (RAMIS) procedures, but for traditional, manual Minimally Invasive Surgery (MIS), they are not readily available as a method. 3D visual features-based evaluation methods tend to outperform 2D methods, but their utility is limited and not suited to MIS training, therefore our proposed solution relies on 2D features. The application of additional sensors potentially enhances the performance of either approach. This paper introduces a general 2D image-based solution that enables the creation and application of surgical skill assessment in any training environment. The 2D features were processed using the feature extraction techniques of a previously published benchmark to assess the attainable accuracy. We relied on the JHU–ISI Gesture and Skill Assessment Working Set dataset—co-developed by the Johns Hopkins University and Intuitive Surgical Inc. Using this well-established set gives us the opportunity to comparatively evaluate different feature extraction techniques. The algorithm reached up to 95.74% accuracy in individual trials. The highest mean accuracy—averaged over five cross-validation trials—for the surgical subtask of Knot-Tying was 83.54%, for Needle-Passing 84.23% and for Suturing 81.58%. The proposed method measured well against the state of the art in 2D visual-based skill assessment, with more than 80% accuracy for all three surgical subtasks available in JIGSAWS (Knot-Tying, Suturing and Needle-Passing). By introducing new visual features—such as image-based orientation and image-based collision detection—or, from the evaluation side, utilising other Support Vector Machine kernel methods, tuning the hyperparameters or using other classification methods (e.g., the boosted trees algorithm) instead, classification accuracy can be further improved. We showed the potential use of optical flow as an input for RAMIS skill assessment, highlighting the maximum accuracy achievable with these data by evaluating it with an established skill assessment benchmark, by evaluating its methods independently. The highest performing method, the Residual Neural Network, reached means of 81.89%, 84.23% and 83.54% accuracy for the skills of Suturing, Needle-Passing and Knot-Tying, respectively.
No abstract
The recent advancements in the surging field of Deep Learning (DL) have revolutionized every sphere of life, and the healthcare domain is no exception. The enormous success of DL models, particularly with image data, has led to the development of image-guided Robot Assisted Surgery (RAS) systems. By and large, the number of studies concerning image-driven computer assisted surgical systems using DL has increased exponentially. Additionally, the contemporary availability of surgical datasets has also boosted the DL applications in RAS. Inspired by the latest trends and contributions in surgery, this literature survey presents a summarized analysis of recent innovations of DL in image-guided RAS systems. After a thorough review, a sum of 184 articles are selected and grouped into four categories, based on the literature and the relevancy of the task in the articles, comprising 1) Surgical Tools, 2) Surgical Processes, 3) Surgical Surveillance, and 4) Surgical Performance. The survey also discusses publicly available surgical datasets and highlights the basics of the DL models. Furthermore, the legal, ethical, and technological challenges together with the intuitive predictions and recommendations related to the autonomous RAS systems are also presented. The study reveals that Convolutional Neural Network (CNN) is most widely adopted architecture, whereas, JIGSAWS is most employed dataset in RAS. The study suggests fusing kinematic data along with image data, which produces better accuracy and precision, particularly in gesture and trajectory segmentation tasks. Additionally, CNN and Long Short Term Memory (LSTM) networks have shown remarkable performance, however, authors recommend employing these gigantic architectures only when simpler models have failed to produce satisfactory results. The simpler models, despite their limitations, are time and cost effective and yield considerable outcomes even on the smaller datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.