Disentangling latent space of variational autoencoder with distribution dependent guarantees for out-of-distribution detection and reasoning

Bibliographic Details
Main Author: Rahiminasab Zahra Reza (Zahra Rahiminasab)
Other Authors: Arvind Easwaran
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
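Subjects: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence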
Online Access: https://hdl.handle.net/10356/172959
author Rahiminasab Zahra Reza (Zahra Rahiminasab)
author2 Arvind Easwaran
author_facet Arvind Easwaran
Rahiminasab Zahra Reza (Zahra Rahiminasab)
author_sort Rahiminasab Zahra Reza (Zahra Rahiminasab)
collection NTU
description Cyber-physical systems (CPS) have diverse applications, especially in safety-critical settings such as autonomous vehicles (AVs). In safety-critical systems, any mistake can lead to irreversible consequences, such as loss of life. Therefore, ensuring the safety of such systems is vital. Many safety-critical CPS use machine learning (ML) models for tasks such as object detection and image segmentation. As a result, approaches are needed to examine the safety of the predictions generated by such ML models. One of the problems in examining the safety of ML models is the out-of-distribution (OOD) problem, in which the objective is to identify test samples drawn from distributions different from the distribution of the training samples. For example, in an object detection application, if the ML model is trained on daytime images and the test sample is a nighttime image, the test sample should be identified as OOD. The OOD problem can be divided into two subproblems: OOD detection and OOD reasoning. In OOD detection, the goal is to identify whether or not a sample is OOD. In OOD reasoning, by contrast, the objective is to explain the OOD behavior. Approaches based on one-class classifiers perform poorly on real-world datasets with multi-label data and overlapping partitions. For example, in the nuScenes dataset, each image has multiple labels, such as pedestrian presence and weather. An image belonging to the low-rain-intensity partition can also belong to the no-pedestrian partition. As a result, in our first attempt, we create an ensemble of β-variational autoencoders (EBVAE) as OOD detectors that can be trained with multi-label data. A β-variational autoencoder (β-VAE) maps each input to a lower-dimensional latent representation and reconstructs the image from the learned representation. Each β-VAE in the ensemble learns a representation corresponding to one generative factor.
Generative factors are the critical factors in an image that are necessary for its reconstruction. This thesis focuses on meteorological (such as rain intensity) and background generative factors. Addressing the OOD problem for these factors is challenging because they affect all the pixels of the image and can depend on one another. In each β-VAE, we identify the most sensitive representation dimension for each generative factor. Disentanglement means establishing one-to-many maps between generative factors and their representative latent dimensions. Disentangling the latent space of a VAE or its variants is a key step for OOD reasoning when a single VAE is used to learn known generative factors corresponding to the meteorological features and background of an image. Disentangling the latent space of a VAE is only possible with bias on the model or supervision on the data. In our second attempt to resolve the OOD detection and reasoning problem for multi-label data, we use a single VAE to decrease OOD inference overhead. We impose bias on the OOD model through hyperparameter tuning to disentangle generative factors. We call the resulting model the hyperparameter-based disentangled β-VAE (HPVAE). We use change-point detection approaches to account for the effect of time dependency between data samples on OOD detection and reasoning. The HPVAE-based solution generates one-to-many maps between generative factors and their corresponding latent dimensions, but these maps can differ between training and inference time.
As a result, in our third effort to solve the OOD detection and reasoning problem, we incorporate disentanglement into the training process by adding disentanglement constraints as regularization terms to the loss function. We use match-pairing weak supervision to train the VAE with a disentangled latent space. In the match-pairing setting, samples are divided into groups such that samples from the same group have the same value or range of values for specific generative factors. Achieving total disentanglement, even with supervision, is impossible in practice due to, among other reasons, the presence of unknown generative factors and dependencies between different generative factors. The disentanglement constraints are expressed in fuzzy logic, since fuzzy logic helps to formalize partial disentanglement. We call the model trained with this approach the weakly supervised logic variational autoencoder (WDLVAE). Finally, we propose the disentangled distilled VAE (DDV) to reduce the model size while preserving disentanglement properties. The motivation for this framework is to reduce the resources required when OOD reasoners and detectors are deployed on resource-constrained devices. For model compression, we use student-teacher knowledge distillation. To ensure that disentanglement is preserved during model compression, we formulate the task as a constrained optimization problem with disentanglement constraints. To provide a theoretical guarantee for disentanglement during distillation, we analyze the optimality of the obtained solutions and use generalization bounds. We also evaluate our approach empirically by deploying the compressed model on a resource-constrained device.
first_indexed 2024-10-01T03:28:00Z
format Thesis-Doctor of Philosophy
id ntu-10356/172959
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:28:00Z
publishDate 2024
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/172959 2024-02-01T09:53:44Z Disentangling latent space of variational autoencoder with distribution dependent guarantees for out-of-distribution detection and reasoning Rahiminasab Zahra Reza (Zahra Rahiminasab) Arvind Easwaran School of Computer Science and Engineering arvinde@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Doctor of Philosophy 2024-01-08T05:31:17Z 2024-01-08T05:31:17Z 2023 Thesis-Doctor of Philosophy Rahiminasab Zahra Reza (Zahra Rahiminasab) (2023). Disentangling latent space of variational autoencoder with distribution dependent guarantees for out-of-distribution detection and reasoning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/172959 10.32657/10356/172959 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
title Disentangling latent space of variational autoencoder with distribution dependent guarantees for out-of-distribution detection and reasoning
topic Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
url https://hdl.handle.net/10356/172959