Gaze-assisted automatic captioning of fetal ultrasound videos using three-way multi-modal deep neural networks
In this work, we present a novel gaze-assisted natural language processing (NLP)-based video captioning model to describe routine second-trimester fetal ultrasound scan videos in a vocabulary of spoken sonography. The primary novelty of our multi-modal approach is that the learned video captioning m...
Main Authors: | , , , , , |
---|---|
Format: | Journal article |
Language: | English |
Published: | Elsevier, 2022 |