Laparoscopic Video Analysis Using Temporal, Attention, and Multi-Feature Fusion Based-Approaches

Adapting intelligent context-aware systems (CAS) to future operating rooms (OR) aims to improve situational awareness and provide surgical decision support systems to medical teams. CAS analyzes data streams from available devices during surgery and communicates real-time knowledge to clinicians. In...

Bibliographic Details
Main Authors:	Nour Aldeen Jalal, Tamer Abdulbaki Alshirbaji, Paul David Docherty, Herag Arabian, Bernhard Laufer, Sabine Krueger-Ziolek, Thomas Neumuth, Knut Moeller
Format:	Article
Language:	English
Published:	MDPI AG 2023-02-01
Series:	Sensors
Subjects:	context-aware system laparoscopic video analysis surgical phase recognition surgical tool classification surgical tool localization
Online Access:	https://www.mdpi.com/1424-8220/23/4/1958

author	Nour Aldeen Jalal Tamer Abdulbaki Alshirbaji Paul David Docherty Herag Arabian Bernhard Laufer Sabine Krueger-Ziolek Thomas Neumuth Knut Moeller
author_sort	Nour Aldeen Jalal
collection	DOAJ
description	Adapting intelligent context-aware systems (CAS) to future operating rooms (OR) aims to improve situational awareness and provide surgical decision support to medical teams. CAS analyze data streams from available devices during surgery and communicate real-time knowledge to clinicians. Recent advances in computer vision and machine learning, particularly deep learning, have paved the way for extensive research to develop CAS. In this work, a deep learning approach was proposed for surgical phase recognition, tool classification, and weakly-supervised tool localization in laparoscopic videos. The ResNet-50 convolutional neural network (CNN) architecture was adapted by adding attention modules and fusing features from multiple stages to generate better-focused, more general, and more representative features. Then, a multi-map convolutional layer followed by tool-wise and spatial pooling operations was used to perform tool localization and generate tool presence confidences. Finally, a long short-term memory (LSTM) network was employed to model temporal information and perform tool classification and phase recognition. The proposed approach was evaluated on the Cholec80 dataset. The experimental results (88.5% mean precision and 89.0% mean recall for phase recognition, 95.6% mean average precision for tool presence detection, and a 70.1% F1-score for tool localization) demonstrated the ability of the model to learn discriminative features for all tasks. The results revealed the importance of integrating attention modules and multi-stage feature fusion for more robust and precise detection of surgical phases and tools.
first_indexed	2024-03-11T08:10:52Z
format	Article
id	doaj.art-6fb36159dc204834879b174cee4e9fad
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-11T08:10:52Z
publishDate	2023-02-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-6fb36159dc204834879b174cee4e9fad | 2023-11-16T23:08:24Z | eng | MDPI AG | Sensors | 1424-8220 | 2023-02-01 | 23 | 4 | 1958 | 10.3390/s23041958 | Laparoscopic Video Analysis Using Temporal, Attention, and Multi-Feature Fusion Based-Approaches | Nour Aldeen Jalal [0], Tamer Abdulbaki Alshirbaji [1], Paul David Docherty [2], Herag Arabian [3], Bernhard Laufer [4], Sabine Krueger-Ziolek [5], Thomas Neumuth [6], Knut Moeller [7] | [0-5, 7] Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany | [6] Innovation Center Computer Assisted Surgery (ICCAS), University of Leipzig, 04103 Leipzig, Germany | https://www.mdpi.com/1424-8220/23/4/1958 | context-aware system; laparoscopic video analysis; surgical phase recognition; surgical tool classification; surgical tool localization
title	Laparoscopic Video Analysis Using Temporal, Attention, and Multi-Feature Fusion Based-Approaches
topic	context-aware system laparoscopic video analysis surgical phase recognition surgical tool classification surgical tool localization
url	https://www.mdpi.com/1424-8220/23/4/1958

Similar Items