Video understanding using multimodal deep learning
Our experience of the world is multimodal; however, deep learning networks have traditionally been designed for and trained on unimodal inputs such as images, audio segments, or text. In this thesis we develop strategies to exploit multimodal information (in the form of vision, text, speech a...
| Main author: | Nagrani, A |
|---|---|
| Other authors: | Zisserman, A |
| Format: | Thesis |
| Language: | English |
| Published: | 2020 |
| Subjects: | |
Similar documents
- Sign language understanding using multimodal learning
  by: Momeni, L
  Published: (2024)
- Understanding Multimodal Popularity Prediction of Social Media Videos With Self-Attention
  by: Adam Bielski, et al.
  Published: (2018-01-01)
- End-to-end learning, and audio-visual human-centric video understanding
  by: Brown, A
  Published: (2022)
- Holistic image understanding with deep learning and dense random fields
  by: Zheng, S
  Published: (2016)
- Learning with multimodal self-supervision
  by: Chen, H
  Published: (2021)