Few-Shot Anomaly Detection via Personalization

Even with plenty of normal samples, anomaly detection is considered a challenging machine learning task because of its one-class nature, i.e., the lack of anomalous samples at training time. Only recently has a few-shot regime of anomaly detection become feasible in this regard...

Bibliographic Details
Main Authors:	Sangkyung Kwak, Jongheon Jeong, Hankook Lee, Woohyuck Kim, Dongho Seo, Woojin Yun, Wonjin Lee, Jinwoo Shin
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Industrial anomaly detection; model personalization; text-to-image diffusion model; vision-language model
Online Access:	https://ieeexplore.ieee.org/document/10401164/

author	Sangkyung Kwak, Jongheon Jeong, Hankook Lee, Woohyuck Kim, Dongho Seo, Woojin Yun, Wonjin Lee, Jinwoo Shin
collection	DOAJ
description	Even with plenty of normal samples, anomaly detection is considered a challenging machine learning task because of its one-class nature, i.e., the lack of anomalous samples at training time. Despite its wide applicability, a few-shot regime of anomaly detection has become feasible only recently, e.g., with help from large pre-trained vision-language models such as CLIP. In this paper, we explore the potential of large text-to-image generative models for few-shot industrial anomaly detection. Specifically, recent text-to-image models have shown an unprecedented ability to generalize from a few images, extracting their common and unique concepts and even encoding them into a textual token to “personalize” the model, a technique known as textual inversion. Here, we ask whether this personalization is specific enough to discriminate the given images from their potential anomalies, which are often open-ended, local, and hard to detect. We observe that standard textual inversion shows a weak understanding of localized details within objects, which is not enough to detect industrial anomalies accurately. We therefore explore model personalization for anomaly detection and propose Anomaly Detection via Personalization (ADP). ADP extracts fine-grained local details shared across the images using a simple yet effective regularization scheme derived from the zero-shot transferability of CLIP. We also propose a self-tuning scheme that further optimizes the detection pipeline by leveraging synthetic data generated from the personalized generative model. Our experiments show that the proposed inversion scheme can achieve state-of-the-art results on two industrial anomaly benchmarks, MVTec-AD and VisA, in the regime of few normal samples.
first_indexed	2024-03-08T11:57:15Z
format	Article
id	doaj.art-21ba7c8ea47a4ea999ac2a0abd1b0fd4
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-08T11:57:15Z
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-21ba7c8ea47a4ea999ac2a0abd1b0fd4; 2024-01-24T00:00:44Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 11035-11051; DOI 10.1109/ACCESS.2024.3355021; IEEE document 10401164; Few-Shot Anomaly Detection via Personalization
	Sangkyung Kwak (https://orcid.org/0000-0001-9145-5876), Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
	Jongheon Jeong (https://orcid.org/0000-0002-4058-5774), Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
	Hankook Lee (https://orcid.org/0009-0004-5959-9908), LG AI Research, Seoul, Republic of Korea
	Woohyuck Kim (https://orcid.org/0009-0007-6053-6332), Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
	Dongho Seo (https://orcid.org/0000-0002-3394-3422), LIG Nex1, Geonggi, Republic of Korea
	Woojin Yun, LIG Nex1, Geonggi, Republic of Korea
	Wonjin Lee, LIG Nex1, Geonggi, Republic of Korea
	Jinwoo Shin (https://orcid.org/0000-0003-4313-4669), Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
title	Few-Shot Anomaly Detection via Personalization
topic	Industrial anomaly detection; model personalization; text-to-image diffusion model; vision-language model
url	https://ieeexplore.ieee.org/document/10401164/

Similar Items