Multi-Stage Temporal Convolutional Network with Moment Loss and Positional Encoding for Surgical Phase Recognition


Bibliographic Details
Main Authors: Minyoung Park, Seungtaek Oh, Taikyeong Jeong, Sungwook Yu
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Diagnostics
Subjects:
Online Access: https://www.mdpi.com/2075-4418/13/1/107
author Minyoung Park
Seungtaek Oh
Taikyeong Jeong
Sungwook Yu
collection DOAJ
description In recent years, surgical video analysis has been studied extensively because of its growing importance in medical applications. In particular, recognizing the current surgical phase is important because phase information can be used in various ways both during and after surgery. This paper proposes an efficient phase recognition network, called MomentNet, for cholecystectomy endoscopic videos. Unlike LSTM-based networks, MomentNet is based on a multi-stage temporal convolutional network. To improve phase prediction accuracy, the proposed method adopts a new loss function that supplements the standard cross-entropy loss. The new loss function significantly improves the performance of the phase recognition network by constraining undesirable phase transitions and preventing over-segmentation. In addition, MomentNet applies positional encoding techniques, which are commonly used in transformer architectures, to the multi-stage temporal convolutional network. Positional encoding provides important temporal context, resulting in higher phase prediction accuracy. Furthermore, MomentNet applies label smoothing to suppress overfitting and replaces the backbone network for feature extraction to further improve performance. As a result, MomentNet achieves 92.31% accuracy on the phase recognition task with the Cholec80 dataset, which is 4.55% higher than that of the baseline architecture.
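Illustrative note: the description names two standard techniques, positional encoding and label smoothing, without giving formulas. The sketch below shows their textbook forms only, as a rough aid to reading the abstract. It assumes the sinusoidal encoding used in transformer architectures and uniform label smoothing; the function names, the `eps=0.1` default, and the dimensions are hypothetical, and this record does not say how MomentNet actually combines these with its temporal convolutional stages or its moment loss.

```python
import math


def sinusoidal_positional_encoding(num_frames, dim):
    """Return a num_frames x dim table of sinusoidal position codes.

    Textbook transformer formulation (assumed here, not taken from the paper):
    even columns hold sin(t / 10000^(i/dim)), odd columns the matching cos.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for t in range(num_frames):
        row = []
        for i in range(0, dim, 2):
            angle = t / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table


def smooth_labels(target, num_classes, eps=0.1):
    """Uniform label smoothing: eps / num_classes on every class,
    plus (1 - eps) on the true class. eps=0.1 is an assumed default."""
    base = eps / num_classes
    return [base + (1.0 - eps) if k == target else base
            for k in range(num_classes)]


def cross_entropy(probs, soft_target):
    """Cross entropy between predicted probabilities and a soft target."""
    return -sum(q * math.log(p) for p, q in zip(probs, soft_target) if q > 0)


# Hypothetical usage: 4 frames, 8-dimensional codes, 7 phase classes.
pe = sinusoidal_positional_encoding(num_frames=4, dim=8)
soft = smooth_labels(target=2, num_classes=7)
loss = cross_entropy([1.0 / 7] * 7, soft)
```

In a frame-wise phase recognition setting, a table like `pe` would typically be added to or concatenated with per-frame features so that the temporal model can tell early frames from late ones, while `soft` replaces the one-hot target in the cross-entropy term.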
first_indexed 2024-03-11T10:04:48Z
format Article
id doaj.art-46e10a71b40842a4997648e4752725c3
institution Directory Open Access Journal
issn 2075-4418
language English
last_indexed 2024-03-11T10:04:48Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Diagnostics
spelling doaj.art-46e10a71b40842a4997648e4752725c3
2023-11-16T15:08:56Z
eng
MDPI AG
Diagnostics
2075-4418
2022-12-01
Volume 13, Issue 1, Article 107
10.3390/diagnostics13010107
Multi-Stage Temporal Convolutional Network with Moment Loss and Positional Encoding for Surgical Phase Recognition
Minyoung Park (School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea)
Seungtaek Oh (School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea)
Taikyeong Jeong (School of Artificial Intelligence Convergence, Hallym University, Chuncheon 24252, Republic of Korea)
Sungwook Yu (School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea)
https://www.mdpi.com/2075-4418/13/1/107
surgical phase recognition
Cholec80
moment loss
positional encoding
label smoothing
EfficientNet
title Multi-Stage Temporal Convolutional Network with Moment Loss and Positional Encoding for Surgical Phase Recognition
topic surgical phase recognition
Cholec80
moment loss
positional encoding
label smoothing
EfficientNet
url https://www.mdpi.com/2075-4418/13/1/107