Multi-modal learning from video, eye tracking, and pupillometry for operator skill characterization in clinical fetal ultrasound

Bibliographic Details
Main Authors: Sharma, H; Drukker, L; Papageorghiou, AT; Noble, JA
Format: Conference item
Language: English
Published: IEEE, 2021

Description: This paper presents a novel multi-modal learning approach for automated skill characterization of obstetric ultrasound operators using heterogeneous spatio-temporal sensory cues, namely scan video, eye-tracking data, and pupillometric data, acquired in the clinical environment. We address pertinent challenges, such as combining heterogeneous, small-scale, and variable-length sequential datasets to train deep convolutional neural networks in real-world scenarios. We propose spatial encoding for multi-modal analysis using sonography standard plane images, spatial gaze maps, gaze trajectory images, and pupillary response images. We present and compare five multi-modal learning network architectures using late, intermediate, hybrid, and tensor fusion. We build models for the Heart and Brain scanning tasks, and performance evaluation suggests that multi-modal learning networks outperform uni-modal networks, with the best-performing model achieving accuracies of 82.4% (Brain task) and 76.4% (Heart task) on the operator skill classification problem.
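
The abstract above describes encoding eye-tracking and pupillometry sequences as images (spatial gaze maps, gaze trajectory images, pupillary response images) and fusing them with scan video through several CNN fusion schemes. Below is a minimal, illustrative PyTorch sketch of that general idea: per-modality CNN encoders whose features are concatenated before classification, roughly a feature-level fusion. The helper gaze_to_spatial_map, the layer sizes, the two skill classes, and all other names are assumptions for illustration only and do not reproduce the authors' late, intermediate, hybrid, or tensor fusion architectures.

```python
# Illustrative sketch only -- not the authors' architecture.
# Shows: (1) encoding a variable-length gaze sequence as a fixed-size image,
#        (2) per-modality CNN encoders fused by feature concatenation.
import torch
import torch.nn as nn


def gaze_to_spatial_map(gaze_xy, image_size=64):
    """Bin normalised (x, y) gaze points into a 2D occupancy map
    (a simple stand-in for a spatial gaze map)."""
    grid = torch.zeros(image_size, image_size)
    idx = (gaze_xy.clamp(0, 1) * (image_size - 1)).long()
    for x, y in idx:
        grid[y, x] += 1.0
    return grid / grid.max().clamp(min=1.0)  # normalise to [0, 1]


class ModalityEncoder(nn.Module):
    """Small CNN mapping one image-encoded modality to a feature vector."""
    def __init__(self, in_channels=1, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)


class ConcatFusionSkillClassifier(nn.Module):
    """Encode each modality separately, concatenate features, then classify."""
    def __init__(self, n_modalities=3, feat_dim=64, n_classes=2):
        super().__init__()
        self.encoders = nn.ModuleList(
            [ModalityEncoder(feat_dim=feat_dim) for _ in range(n_modalities)]
        )
        self.head = nn.Linear(n_modalities * feat_dim, n_classes)

    def forward(self, modality_images):
        feats = [enc(x) for enc, x in zip(self.encoders, modality_images)]
        return self.head(torch.cat(feats, dim=1))


if __name__ == "__main__":
    gaze_map = gaze_to_spatial_map(torch.rand(500, 2))   # (64, 64) image
    # Toy batch: ultrasound frame, spatial gaze map, pupillary response image.
    batch = [torch.rand(4, 1, 64, 64) for _ in range(3)]
    logits = ConcatFusionSkillClassifier()(batch)
    print(gaze_map.shape, logits.shape)                   # (64, 64), (4, 2)
```

For comparison, tensor fusion typically combines modality embeddings through an outer product rather than concatenation, which is one of the variants the paper evaluates empirically.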