Video understanding using multimodal deep learning
<p>Our experience of the world is multimodal, however deep learning networks have been traditionally designed for and trained on unimodal inputs such as images, audio segments or text. In this thesis we develop strategies to exploit multimodal information (in the form of vision, text, speech a...
Үндсэн зохиолч: | Nagrani, A |
---|---|
Бусад зохиолчид: | Zisserman, A |
Формат: | Дипломын ажил |
Хэл сонгох: | English |
Хэвлэсэн: |
2020
|
Нөхцлүүд: |
Ижил төстэй зүйлс
Ижил төстэй зүйлс
-
Sign language understanding using multimodal learning
-н: Momeni, L
Хэвлэсэн: (2024) -
Understanding Multimodal Popularity Prediction of Social Media Videos With Self-Attention
-н: Adam Bielski, зэрэг
Хэвлэсэн: (2018-01-01) -
End-to-end learning, and audio-visual human-centric video understanding
-н: Brown, A
Хэвлэсэн: (2022) -
Holistic image understanding with deep learning and dense random fields
-н: Zheng, S
Хэвлэсэн: (2016) -
Learning with multimodal self-supervision
-н: Chen, H
Хэвлэсэн: (2021)