Video understanding using multimodal deep learning

Our experience of the world is multimodal; however, deep learning networks have traditionally been designed for and trained on unimodal inputs such as images, audio segments or text. In this thesis we develop strategies to exploit multimodal information (in the form of vision, text, speech a...

Bibliographic details
Main author:	Nagrani, A
Other authors:	Zisserman, A
Format:	Thesis
Language:	English
Published:	2020
Subjects:	Computer Vision; Machine Learning

Related records

Sign language understanding using multimodal learning
By: Momeni, L
Published: (2024)

Understanding Multimodal Popularity Prediction of Social Media Videos With Self-Attention
By: Adam Bielski, et al.
Published: (2018-01-01)

End-to-end learning, and audio-visual human-centric video understanding
By: Brown, A
Published: (2022)

Holistic image understanding with deep learning and dense random fields
By: Zheng, S
Published: (2016)

Learning with multimodal self-supervision
By: Chen, H
Published: (2021)

Self-supervised video representation learning
By: Han, T
Published: (2022)

Self-supervised and cross-modal learning from videos
By: Koepke, AS
Published: (2019)

Deep vision for indoor understanding and localisation
By: Howard-Jenkins, H
Published: (2022)

Understanding video through the lens of language
By: Bain, M
Published: (2023)

Pixel-level scene understanding with deep structured models
By: Arnab, A
Published: (2019)

Deep Vision Multimodal Learning: Methodology, Benchmark, and Trend
By: Wenhao Chai, et al.
Published: (2022-06-01)

Looking deep at people: towards understanding and generating humans in images with deep learning
By: de Bem, RA
Published: (2018)

Learning to understand large-scale 3D point clouds
By: Qingyong, H
Published: (2022)

Self-supervised learning using motion and visualizing convolutional neural networks
By: Mahendran, A
Published: (2018)

Visual recognition in art using machine learning
By: Crowley, E
Published: (2017)

Deep Learning-Based Model for Classification of Bean Nitrogen Status Using Digital Canopy Imaging
By: Murilo M. Baesso, et al.
Published: (2023-06-01)

Seeing What You’re Told: Sentence-Guided Activity Recognition In Video
By: Siddharth, Narayanaswamy, et al.
Published: (2015)

An Interpretable Deep Learning-Based Feature Reduction in Video-Based Human Activity Recognition
By: Micheal Dutt, et al.
Published: (2024-01-01)

A Survey on Audio-Video Based Defect Detection Through Deep Learning in Railway Maintenance
By: Lorenzo De Donato, et al.
Published: (2022-01-01)

On the Generalization of Deep Learning Models in Video Deepfake Detection
By: Davide Alessandro Coccomini, et al.
Published: (2023-04-01)

Automatic Detection for Acromegaly Using Hand Photographs: A Deep-Learning Approach
By: Chengbin Duan, et al.
Published: (2021-01-01)

HyMNet: A Multimodal Deep Learning System for Hypertension Prediction Using Fundus Images and Cardiometabolic Risk Factors
By: Mohammed Baharoon, et al.
Published: (2024-10-01)

Corrigendum: Deep Plant Phenomics: A Deep Learning Platform for Complex Plant Phenotyping Tasks
By: Jordan R. Ubbens, et al.
Published: (2018-01-01)

Scalable learning for expanding robot vision
By: Porav, H
Published: (2020)

Robust 2D and 3D registration with deep neural networks
By: Wang, Z
Published: (2024)

Learning shape from images
By: Wiles, O
Published: (2020)

Unsupervised learning of clutter-resistant visual representations from natural videos
By: Liao, Qianli, et al.
Published: (2015)

Challenges and Applications for Implementing Machine Learning in Computer Vision /
By: Kashyap, Ramgopal, 1984- editor., et al.
Published: ([202)

Understanding Mixup Training Methods
By: Daojun Liang, et al.
Published: (2018-01-01)

Multimodal Image-Based Indoor Localization with Machine Learning—A Systematic Review
By: Szymon Łukasik, et al.
Published: (2024-09-01)

Structured learning and prediction in computer vision /
By: Nowozin, Sebastian, et al.
Published: (2011)

A Dataset of apical periodontitis lesions in panoramic radiographs for deep-learning-based classification and detection
By: Hoang Viet Do, et al.
Published: (2024-06-01)

Use and examination of convolutional neural networks for scene understanding
By: Jetley, S
Published: (2018)

Deep Learning Architecture Reduction for fMRI Data
By: Ruben Alvarez-Gonzalez, et al.
Published: (2022-02-01)

Unsupervised learning of 3d objects in the wild
By: Wu, S
Published: (2022)

Deep learning based computer vision approaches for smart agricultural applications
By: V.G. Dhanya, et al.
Published: (2022-01-01)

Classification of protected grassland habitats using deep learning architectures on Sentinel-2 satellite imagery data
By: Gabriel Díaz-Ireland, et al.
Published: (2024-11-01)

Computer vision and machine learning with RGB-D sensors /
By: Shao, Ling
Published: (c201)

Multimodal Deep Learning Integration of Image, Weather, and Phenotypic Data Under Temporal Effects for Early Prediction of Maize Yield
By: Danial Shamsuddin, et al.
Published: (2024-10-01)

Weakly-supervised learning for video understanding
By: Deng, Dingfan
Published: (2023)