Audio-visual modelling in a clinical setting
Auditory and visual signals are two primary perception modalities that are usually present together and correlate with each other, not only in natural environments but also in clinical settings. However, audio-visual modelling in the latter case can be more challenging, due to the different sources...
Main Authors: | Jiao, J, Alsharid, M, Drukker, L, Papageorghiou, AT, Zisserman, A, Noble, JA |
---|---|
Format: | Journal article |
Language: | English |
Published: | Nature Research, 2024 |
Similar Items
- Self-supervised contrastive video-speech representation learning for ultrasound
  by: Jiao, J, et al.
  Published: (2020)
- A picture is worth 1000 words: textual analysis of routine 20-week scan
  by: Alsharid, M, et al.
  Published: (2022)
- Towards scale and position invariant task classification using normalised visual scanpaths in clinical fetal ultrasound
  by: Teng, C, et al.
  Published: (2021)
- Captioning ultrasound images automatically
  by: Alsharid, M, et al.
  Published: (2019)
- Gaze-assisted automatic captioning of fetal ultrasound videos using three-way multi-modal deep neural networks
  by: Alsharid, M, et al.
  Published: (2022)