Summary: <p>Obstetric ultrasound scanning is a safe and effective tool for the early detection of fetal abnormalities and is therefore crucial for determining the necessity of clinical intervention. However, ultrasound scanning relies on operator expertise, which is a scarce resource globally. Moreover, clinical outcomes vary widely across geographical regions and between observers. To address this, the PULSE project aims to develop a new generation of ultrasound scanning capabilities based on big data and machine learning models that capture the knowledge of experienced sonographers. To this end, the project team acquired a first-of-its-kind large-scale dataset of routine clinical ultrasound scans with gaze-tracking data.</p>
<p>In this thesis, we first examine shortcomings in the operator-machine interaction. We find that sonographers adjust the biometric measurements of fetuses with potential growth abnormalities towards the expected healthy value, providing a possible explanation for the known deficiencies of these measurements. Moreover, we study adherence to safety recommendations on thermal energy emission and find that, while sonographers stay within the appropriate limits, they rarely check the safety indices. We suggest modifications to the ultrasound machine interface to address both issues.</p>
<p>Second, we develop the first model that predicts sonographer gaze on ultrasound video, framed as a visual saliency prediction task. In addition, we propose the first unified visual saliency model that predicts gaze on both images and videos. Besides unifying the two modalities, the model achieves state-of-the-art performance on all relevant computer vision benchmarks for both tasks.</p>
<p>Third, we show that sonographer gaze-tracking data is a powerful supervision signal for learning ultrasound image feature representations. We develop a general framework for representation learning and for transferring the trained neural network to the downstream tasks of standard plane detection and automatic biometry plane annotation. We also show that the learned representations, combined with sonographer gaze prediction, can be used to discover and localize visually salient anatomical landmarks, i.e., landmarks that sonographers use for visual navigation.</p>
<p>Finally, we provide an overarching discussion and an extended outlook chapter that describes a system for guiding sonographers during standard plane acquisition.</p>