Generative models for robotic arm motion


Bibliographic Details
Main Author: Xu, Wenjie
Other Authors: Wen Bihan
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Subjects: Engineering; Image generation
Online Access: https://hdl.handle.net/10356/180161
description Human-computer interaction technology is advancing rapidly, and one popular research topic is the imitation of human arm movements by robotic arms to achieve visual behavior replication. With continued technological progress, robotic arms are now capable of learning and mimicking movements from videos or images given to them. This thesis proposes three methods that use cross-domain transformation and image generation techniques to convert videos of human arm movements into videos of robotic arm actions. These methods enable real robotic arms to imitate the movements, advancing the development of human-machine interaction. We split the videos into frames and use generative adversarial networks together with a contrastive learning framework that maximizes the mutual information between image patches in the input and output domains, thereby effectively achieving cross-domain transformation. Additionally, the thesis explores direct video-to-video transformation: by employing a Stable Diffusion text-to-image model combined with spatio-temporal attention mechanisms, the generated videos align well with the commands given to the model. This not only broadens the application scope of the model but also provides insights into cross-domain transformation tasks, opening up more possibilities for robotic arm learning.
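The contrastive objective the abstract describes, maximizing mutual information between image patches at corresponding locations in the input (human arm) and output (robot arm) domains, is typically realized as a patchwise InfoNCE loss. The sketch below illustrates that idea only; the patch count, feature dimension, and temperature are illustrative assumptions, not values taken from the thesis:

```python
import numpy as np

def patch_nce_loss(feat_in, feat_out, tau=0.07):
    """Patchwise InfoNCE loss (illustrative sketch).

    feat_in, feat_out: (num_patches, dim) embeddings of patches sampled at
    the same spatial locations in the input and output images. The output
    patch at location i should match the input patch at the same location
    (positive pair) and differ from patches elsewhere (negatives).
    """
    # Cosine similarity via L2-normalized embeddings.
    f_in = feat_in / np.linalg.norm(feat_in, axis=1, keepdims=True)
    f_out = feat_out / np.linalg.norm(feat_out, axis=1, keepdims=True)
    logits = (f_out @ f_in.T) / tau                      # (N, N) similarities
    # Cross-entropy with the diagonal (same location) as the target class.
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy check: features aligned by location give a lower loss than misaligned ones.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))        # 8 patches, 16-dim features (assumed)
loss_aligned = patch_nce_loss(feats, feats)
loss_shuffled = patch_nce_loss(feats, feats[::-1])
```

In practice the embeddings would come from encoder layers of the translation network rather than random arrays; minimizing this loss pulls corresponding patches together across domains, which is one standard way to realize the mutual-information objective.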
school School of Electrical and Electronic Engineering
contact bihan.wen@ntu.edu.sg
degree Master's degree
citation Xu, W. (2024). Generative models for robotic arm motion. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/180161