Unsupervised learning with diffusion models

In computer vision, a key goal is to obtain visual representations that faithfully capture the underlying structure and semantics of the data, encompassing object identities, positions, textures, and lighting conditions. However, existing methods for un-/self-supervised learning (SSL) are restricted...

Bibliographic details
Main author:	Wang, Jiankun
Other authors:	Weichen Liu
Format:	Thesis-Master by Research
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Computer science and engineering
Online access:	https://hdl.handle.net/10356/171953

_version_	1826117014777757696
author	Wang, Jiankun
author2	Weichen Liu
author_facet	Weichen Liu Wang, Jiankun
author_sort	Wang, Jiankun
collection	NTU
description	In computer vision, a key goal is to obtain visual representations that faithfully capture the underlying structure and semantics of the data, encompassing object identities, positions, textures, and lighting conditions. However, existing methods for un-/self-supervised learning (SSL) are restricted to untangling basic augmentation attributes such as rotation and color modification, which constrains their capacity to efficiently modularize the underlying semantics. In this thesis, we propose DiffSiam, a novel SSL framework that incorporates a disentangled representation learning algorithm based on diffusion models. By introducing additional Gaussian noise during the diffusion forward process, DiffSiam collapses samples with similar attributes, intensifying the attribute loss. To compensate, we learn an expanding set of modular features over time, adhering to the reconstruction objective of the diffusion model. These training dynamics bias the learned features towards disentangling diverse semantics, from fine-grained to coarse-grained attributes. Experimental results demonstrate the superior performance of DiffSiam on various classification benchmarks and generative tasks, validating its effectiveness in generating disentangled representations.
first_indexed	2024-10-01T04:20:50Z
format	Thesis-Master by Research
id	ntu-10356/171953
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T04:20:50Z
publishDate	2023
publisher	Nanyang Technological University
record_format	dspace
spelling	ntu-10356/1719532023-12-01T01:52:37Z Unsupervised learning with diffusion models Wang, Jiankun Weichen Liu School of Computer Science and Engineering liu@ntu.edu.sg Engineering::Computer science and engineering In computer vision, a key goal is to obtain visual representations that faithfully capture the underlying structure and semantics of the data, encompassing object identities, positions, textures, and lighting conditions. However, existing methods for un-/self-supervised learning (SSL) are restricted to untangling basic augmentation attributes such as rotation and color modification, which constrains their capacity to efficiently modularize the underlying semantics. In the thesis, we propose DiffSiam, a novel SSL framework that incorporates a disentangled representation learning algorithm based on diffusion models. By introducing additional Gaussian noises during the diffusion forward process, DiffSiam collapses samples with similar attributes, intensifying the attribute loss. To compensate, we learn an expanding set of modular features over time, adhering to the reconstruction of the Diffusion Model. This training dynamics biases the learned features towards disentangling diverse semantics, from fine-grained to coarse-grained attributes. Experimental results demonstrate the superior performance of DiffSiam on various classification benchmarks and generative tasks, validating its effectiveness in generating disentangled representations. Master of Engineering 2023-11-17T04:16:22Z 2023-11-17T04:16:22Z 2023 Thesis-Master by Research Wang, J. (2023). Unsupervised learning with diffusion models. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171953 https://hdl.handle.net/10356/171953 10.32657/10356/171953 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
spellingShingle	Engineering::Computer science and engineering Wang, Jiankun Unsupervised learning with diffusion models
title	Unsupervised learning with diffusion models
title_full	Unsupervised learning with diffusion models
title_fullStr	Unsupervised learning with diffusion models
title_full_unstemmed	Unsupervised learning with diffusion models
title_short	Unsupervised learning with diffusion models
title_sort	unsupervised learning with diffusion models
topic	Engineering::Computer science and engineering
url	https://hdl.handle.net/10356/171953
work_keys_str_mv	AT wangjiankun unsupervisedlearningwithdiffusionmodels
