Learning to anticipate and forecast human actions from videos


Bibliographic Details
Main Author: Peh, Eric Zheng Quan
Other Authors: Soh Cheong Boon
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2022
Subjects:
Online Access: https://hdl.handle.net/10356/158618
Description
Summary: Action anticipation and forecasting aim to predict future actions by processing videos containing past and current observations. In this project, we develop new methods based on the encoder-decoder architecture with Transformer models to anticipate and forecast future human actions from videos. The model will observe a video for several seconds (or minutes) and then encode the information in the video to predict plausible human actions that are going to happen in the future. Temporal information will be extracted from the videos using deep neural networks. The performance of these models will then be evaluated on standard action forecasting datasets such as Breakfast and 50Salads.
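
The observe-then-predict setup described in the summary can be illustrated with a minimal sketch of how a labelled video might be split into an observed part (the model's input) and a future part (the prediction target). This is not the project's code: the function names `split_observation` and `to_segments`, the frame rate, the time windows, and the toy action labels are all hypothetical and chosen only for illustration.

```python
from itertools import groupby


def split_observation(frame_labels, fps, observe_seconds, horizon_seconds):
    """Split per-frame action labels into an observed part and a future part.

    Hypothetical helper: the observed part is what a model would see,
    the future part is what it would be asked to predict.
    """
    n_obs = int(observe_seconds * fps)
    n_future = int(horizon_seconds * fps)
    observed = frame_labels[:n_obs]
    future = frame_labels[n_obs:n_obs + n_future]
    return observed, future


def to_segments(frame_labels):
    """Collapse runs of identical labels into (action, n_frames) pairs."""
    return [(label, len(list(run))) for label, run in groupby(frame_labels)]


# Toy video at 1 frame per second; the labels are made up for illustration.
video = ["take_cup"] * 3 + ["pour_milk"] * 4 + ["stir"] * 3

observed, future = split_observation(
    video, fps=1, observe_seconds=5, horizon_seconds=4
)

print(to_segments(observed))  # [('take_cup', 3), ('pour_milk', 2)]
print(to_segments(future))    # [('pour_milk', 2), ('stir', 2)]
```

In this sketch the future part begins in the middle of an ongoing action (`pour_milk`), so a forecasting model would need to predict both the remainder of the current action and the action that follows it.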