Video understanding using multimodal deep learning

Video understanding using multimodal deep learning

<p>Our experience of the world is multimodal, however deep learning networks have been traditionally designed for and trained on unimodal inputs such as images, audio segments or text. In this thesis we develop strategies to exploit multimodal information (in the form of vision, text, speech a...

Mô tả đầy đủ

Chi tiết về thư mục
Tác giả chính:	Nagrani, A
Tác giả khác:	Zisserman, A
Định dạng:	Luận văn
Ngôn ngữ:	English
Được phát hành:	2020
Những chủ đề:	Computer Vision Machine Learning

Những quyển sách tương tự

Sign language understanding using multimodal learning
Bằng: Momeni, L
Được phát hành: (2024)

Understanding Multimodal Popularity Prediction of Social Media Videos With Self-Attention
Bằng: Adam Bielski, et al.
Được phát hành: (2018-01-01)

End-to-end learning, and audio-visual human-centric video understanding
Bằng: Brown, A
Được phát hành: (2022)

Holistic image understanding with deep learning and dense random fields
Bằng: Zheng, S
Được phát hành: (2016)

Learning with multimodal self-supervision
Bằng: Chen, H
Được phát hành: (2021)

Self-supervised video representation learning
Bằng: Han, T
Được phát hành: (2022)

Self-supervised and cross-modal learning from videos
Bằng: Koepke, AS
Được phát hành: (2019)

Deep vision for indoor understanding and localisation
Bằng: Howard-Jenkins, H
Được phát hành: (2022)

Understanding video through the lens of language
Bằng: Bain, M
Được phát hành: (2023)

Pixel-level scene understanding with deep structured models
Bằng: Arnab, A
Được phát hành: (2019)

Deep Vision Multimodal Learning: Methodology, Benchmark, and Trend
Bằng: Wenhao Chai, et al.
Được phát hành: (2022-06-01)

Looking deep at people: towards understanding and generating humans in images with deep learning
Bằng: de Bem, RA
Được phát hành: (2018)

Learning to understand large-scale 3D point clouds
Bằng: Qingyong, H
Được phát hành: (2022)

Self-supervised learning using motion and visualizing convolutional neural networks
Bằng: Mahendran, A
Được phát hành: (2018)

Visual recognition in art using machine learning
Bằng: Crowley, E
Được phát hành: (2017)

DEEP LEARNING-BASED MODEL FOR CLASSIFICATION OF BEAN NITROGEN STATUS USING DIGITAL CANOPY IMAGING
Bằng: Murilo M. Baesso, et al.
Được phát hành: (2023-06-01)

Seeing What You’re Told: Sentence-Guided Activity Recognition In Video
Bằng: Siddharth, Narayanaswamy, et al.
Được phát hành: (2015)

An Interpretable Deep Learning-Based Feature Reduction in Video-Based Human Activity Recognition
Bằng: Micheal Dutt, et al.
Được phát hành: (2024-01-01)

A Survey on Audio-Video Based Defect Detection Through Deep Learning in Railway Maintenance
Bằng: Lorenzo De Donato, et al.
Được phát hành: (2022-01-01)

On the Generalization of Deep Learning Models in Video Deepfake Detection
Bằng: Davide Alessandro Coccomini, et al.
Được phát hành: (2023-04-01)

Automatic Detection for Acromegaly Using Hand Photographs: A Deep-Learning Approach
Bằng: Chengbin Duan, et al.
Được phát hành: (2021-01-01)

HyMNet: A Multimodal Deep Learning System for Hypertension Prediction Using Fundus Images and Cardiometabolic Risk Factors
Bằng: Mohammed Baharoon, et al.
Được phát hành: (2024-10-01)

Corrigendum: Deep Plant Phenomics: A Deep Learning Platform for Complex Plant Phenotyping Tasks
Bằng: Jordan R. Ubbens, et al.
Được phát hành: (2018-01-01)

Scalable learning for expanding robot vision
Bằng: Porav, H
Được phát hành: (2020)

Robust 2D and 3D registration with deep neural networks
Bằng: Wang, Z
Được phát hành: (2024)

Learning shape from images
Bằng: Wiles, O
Được phát hành: (2020)

Unsupervised learning of clutter-resistant visual representations from natural videos
Bằng: Liao, Qianli, et al.
Được phát hành: (2015)

Challenges and Applications for Implementing Machine Learning in Computer Vision /
Bằng: Kashyap, Ramgopal, 1984- editor., et al.
Được phát hành: ([202)

Understanding Mixup Training Methods
Bằng: Daojun Liang, et al.
Được phát hành: (2018-01-01)

Multimodal Image-Based Indoor Localization with Machine Learning—A Systematic Review
Bằng: Szymon Łukasik, et al.
Được phát hành: (2024-09-01)

Structured learning and prediction in computer vision /
Bằng: 525432 Nowozin, Sebastian, et al.
Được phát hành: (2011)

A Dataset of apical periodontitis lesions in panoramic radiographs for deep-learning-based classification and detection
Bằng: Hoang Viet Do, et al.
Được phát hành: (2024-06-01)

Use and examination of convolutional neural networks for scene understanding
Bằng: Jetley, S
Được phát hành: (2018)

Deep Learning Architecture Reduction for fMRI Data
Bằng: Ruben Alvarez-Gonzalez, et al.
Được phát hành: (2022-02-01)

Unsupervised learning of 3d objects in the wild
Bằng: Wu, S
Được phát hành: (2022)

Deep learning based computer vision approaches for smart agricultural applications
Bằng: V.G. Dhanya, et al.
Được phát hành: (2022-01-01)

Classification of protected grassland habitats using deep learning architectures on Sentinel-2 satellite imagery data
Bằng: Gabriel Díaz-Ireland, et al.
Được phát hành: (2024-11-01)

Computer vision and machine learning with RGB-D sensors /
Bằng: Shao, Ling
Được phát hành: (c201)

Multimodal Deep Learning Integration of Image, Weather, and Phenotypic Data Under Temporal Effects for Early Prediction of Maize Yield
Bằng: Danial Shamsuddin, et al.
Được phát hành: (2024-10-01)

Weakly-supervised learning for video understanding
Bằng: Deng, Dingfan
Được phát hành: (2023)