Cross-Scale KNN Image Transformer for Image Restoration
Numerous image restoration approaches based on attention mechanisms have been proposed, achieving performance superior to that of convolutional neural network (CNN)-based counterparts. However, they do not leverage the attention model in a form fully suited to image restoration tasks. In this paper, we...
Main Authors: | Hunsang Lee, Hyesong Choi, Kwanghoon Sohn, Dongbo Min |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
Subjects: | Image restoration; denoising; deblurring; deraining; transformer; self-attention |
Online Access: | https://ieeexplore.ieee.org/document/10036436/ |
_version_ | 1811164994872541184 |
---|---|
author | Hunsang Lee; Hyesong Choi; Kwanghoon Sohn; Dongbo Min |
author_facet | Hunsang Lee; Hyesong Choi; Kwanghoon Sohn; Dongbo Min |
author_sort | Hunsang Lee |
collection | DOAJ |
description | Numerous image restoration approaches based on attention mechanisms have been proposed, achieving performance superior to that of convolutional neural network (CNN)-based counterparts. However, they do not leverage the attention model in a form fully suited to image restoration tasks. In this paper, we propose an image restoration network with a novel attention mechanism, called the cross-scale k-NN image Transformer (CS-KiT), that effectively considers several factors essential to image restoration, such as locality, non-locality, and cross-scale aggregation. To achieve locality and non-locality, the CS-KiT builds a k-nearest-neighbor relation among local patches and aggregates similar patches through local attention. To induce cross-scale aggregation, we ensure that each local patch embraces different scale information via scale-aware patch embedding (SPE), which predicts the scale of an input patch through a combination of multi-scale convolution branches. We show the effectiveness of the CS-KiT with experimental results, outperforming state-of-the-art restoration approaches on image denoising, deblurring, and deraining benchmarks. |
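The description above outlines the two core ideas of CS-KiT: attention restricted to the k nearest (most similar) local patches, and scale-aware patch embedding. Since the record does not include the authors' implementation, the following is a minimal PyTorch sketch of the first idea only; the token shapes, the single-head formulation without learned projections, and the neighbour count k are illustrative assumptions, not the published code.

```python
# Minimal sketch (not the authors' code) of k-NN-restricted attention over
# patch tokens of shape (B, N, C), with a hypothetical neighbour count k.
import torch
import torch.nn.functional as F


def knn_attention(tokens: torch.Tensor, k: int = 8) -> torch.Tensor:
    """Attend each patch token only to its k most similar patch tokens."""
    B, N, C = tokens.shape
    q = key = v = tokens  # single head, no learned projections in this sketch

    # Pairwise similarity between all patch tokens: (B, N, N)
    sim = torch.einsum("bic,bjc->bij", q, key) / C ** 0.5

    # Keep only the k most similar patches per query (sparse but non-local)
    topk_sim, topk_idx = sim.topk(k, dim=-1)          # both (B, N, k)
    attn = F.softmax(topk_sim, dim=-1)                # (B, N, k)

    # Gather the value vectors of the selected neighbours: (B, N, k, C)
    idx = topk_idx.unsqueeze(-1).expand(-1, -1, -1, C)
    neighbours = torch.gather(v.unsqueeze(1).expand(-1, N, -1, -1), 2, idx)

    # Weighted aggregation of the k neighbours for every query patch
    return torch.einsum("bnk,bnkc->bnc", attn, neighbours)


# Example: a batch of 2 images, each split into 64 patch tokens of dimension 32
out = knn_attention(torch.randn(2, 64, 32), k=8)
print(out.shape)  # torch.Size([2, 64, 32])
```

Restricting the softmax to the top-k similarities keeps the aggregation non-local (neighbours may come from anywhere in the image) while remaining sparse, which reflects the locality/non-locality trade-off described in the abstract.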
first_indexed | 2024-04-10T15:30:22Z |
format | Article |
id | doaj.art-8ef10f4b261f4d9eb4d013dd835c8e53 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-10T15:30:22Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-8ef10f4b261f4d9eb4d013dd835c8e53; 2023-02-14T00:00:50Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; vol. 11, pp. 13013-13027; doi: 10.1109/ACCESS.2023.3242556; document 10036436; Cross-Scale KNN Image Transformer for Image Restoration; Hunsang Lee (https://orcid.org/0000-0002-6670-5455), School of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea; Hyesong Choi, Department of Computer Science and Engineering, Ewha Womans University, Seoul, South Korea; Kwanghoon Sohn (https://orcid.org/0000-0002-3715-0331), School of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea; Dongbo Min (https://orcid.org/0000-0003-4825-5240), Department of Computer Science and Engineering, Ewha Womans University, Seoul, South Korea. Numerous image restoration approaches based on attention mechanisms have been proposed, achieving performance superior to that of convolutional neural network (CNN)-based counterparts. However, they do not leverage the attention model in a form fully suited to image restoration tasks. In this paper, we propose an image restoration network with a novel attention mechanism, called the cross-scale k-NN image Transformer (CS-KiT), that effectively considers several factors essential to image restoration, such as locality, non-locality, and cross-scale aggregation. To achieve locality and non-locality, the CS-KiT builds a k-nearest-neighbor relation among local patches and aggregates similar patches through local attention. To induce cross-scale aggregation, we ensure that each local patch embraces different scale information via scale-aware patch embedding (SPE), which predicts the scale of an input patch through a combination of multi-scale convolution branches. We show the effectiveness of the CS-KiT with experimental results, outperforming state-of-the-art restoration approaches on image denoising, deblurring, and deraining benchmarks. https://ieeexplore.ieee.org/document/10036436/ Image restoration; denoising; deblurring; deraining; transformer; self-attention |
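The abstract carried in the record above also describes the scale-aware patch embedding (SPE), which predicts an input patch scale through a combination of multi-scale convolution branches. The sketch below is one plausible reading of that sentence rather than the authors' design: the kernel sizes, embedding dimension, per-pixel soft gating, and the class name ScaleAwarePatchEmbedding are all assumptions made for illustration.

```python
# Hypothetical sketch of a scale-aware patch embedding: parallel convolution
# branches with different receptive fields, softly weighted by a predicted
# per-location scale score. Not the published implementation.
import torch
import torch.nn as nn


class ScaleAwarePatchEmbedding(nn.Module):
    def __init__(self, in_ch: int = 3, embed_dim: int = 48, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One branch per candidate scale (kernel sizes are illustrative).
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, embed_dim, k, padding=k // 2) for k in kernel_sizes
        )
        # Predicts a soft weight for each scale branch at every spatial location.
        self.scale_predictor = nn.Conv2d(in_ch, len(kernel_sizes), 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.stack([b(x) for b in self.branches], dim=1)  # (B, S, C, H, W)
        weights = self.scale_predictor(x).softmax(dim=1)           # (B, S, H, W)
        # Blend the branches according to the predicted scale weights.
        return (feats * weights.unsqueeze(2)).sum(dim=1)           # (B, C, H, W)


embed = ScaleAwarePatchEmbedding()
print(embed(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 48, 64, 64])
```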
spellingShingle | Hunsang Lee; Hyesong Choi; Kwanghoon Sohn; Dongbo Min; Cross-Scale KNN Image Transformer for Image Restoration; IEEE Access; Image restoration; denoising; deblurring; deraining; transformer; self-attention |
title | Cross-Scale KNN Image Transformer for Image Restoration |
title_full | Cross-Scale KNN Image Transformer for Image Restoration |
title_fullStr | Cross-Scale KNN Image Transformer for Image Restoration |
title_full_unstemmed | Cross-Scale KNN Image Transformer for Image Restoration |
title_short | Cross-Scale KNN Image Transformer for Image Restoration |
title_sort | cross scale knn image transformer for image restoration |
topic | Image restoration; denoising; deblurring; deraining; transformer; self-attention |
url | https://ieeexplore.ieee.org/document/10036436/ |
work_keys_str_mv | AT hunsanglee crossscaleknnimagetransformerforimagerestoration AT hyesongchoi crossscaleknnimagetransformerforimagerestoration AT kwanghoonsohn crossscaleknnimagetransformerforimagerestoration AT dongbomin crossscaleknnimagetransformerforimagerestoration |