Video super‐resolution with non‐local alignment network
Abstract Video super‐resolution (VSR) aims at recovering high‐resolution frames from their low‐resolution counterparts. Over the past few years, deep neural networks have dominated the video super‐resolution task because of their strong non‐linear representational ability. To exploit temporal correlations, most deep neural networks face two challenges: (1) how to align consecutive frames containing motion, occlusion and blurring, and establish accurate temporal correspondences; (2) how to effectively fuse the aligned frames and balance their contributions. In this work, a novel video super‐resolution network, named NLVSR, is proposed to solve the above problems in an efficient and effective manner. For alignment, a temporal‐spatial non‐local operation is employed to align each frame to the reference frame. Compared with existing alignment approaches, the proposed temporal‐spatial non‐local operation integrates the global information of each frame through a weighted sum, leading to better alignment. For fusion, an attention‐based progressive fusion framework is designed to integrate the aligned frames gradually. To penalize low‐quality points in the aligned features, an attention mechanism is employed for robust reconstruction. Experimental results demonstrate the superiority of the proposed network in both quantitative and qualitative evaluation; it surpasses other state‐of‐the‐art methods by at least 0.33 dB.
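As a rough illustration of the alignment idea described in the abstract (this is not the authors' code, and all names here are hypothetical): a temporal‐spatial non‐local operation can be sketched as an attention‐style weighted sum, where each position in the reference frame aggregates every position of a neighbouring frame, weighted by feature similarity.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def non_local_align(reference, neighbor):
    """Align `neighbor` to `reference` with a non-local weighted sum.

    Each frame is a list of feature vectors (a flattened spatial grid).
    For every reference position, dot-product similarities against ALL
    neighbor positions are turned into softmax weights, and the aligned
    feature is the weighted sum of the neighbor's features -- so global
    information of the neighbor frame contributes at every position.
    """
    aligned = []
    for q in reference:
        sims = [sum(qc * kc for qc, kc in zip(q, k)) for k in neighbor]
        w = softmax(sims)
        d = len(q)
        aligned.append([
            sum(w[j] * neighbor[j][c] for j in range(len(neighbor)))
            for c in range(d)
        ])
    return aligned
```

In the paper this operation is learned end‐to‐end inside a network; the sketch above only shows the weighted‐sum structure that distinguishes non‐local alignment from motion‐compensation approaches that sample a single displaced location per position.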
Main Authors: | Chao Zhou, Can Chen, Fei Ding, Dengyin Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2021-06-01 |
Series: | IET Image Processing |
Online Access: | https://doi.org/10.1049/ipr2.12134 |
_version_ | 1811251999473139712 |
---|---|
author | Chao Zhou Can Chen Fei Ding Dengyin Zhang |
author_facet | Chao Zhou Can Chen Fei Ding Dengyin Zhang |
author_sort | Chao Zhou |
collection | DOAJ |
description | Abstract Video super‐resolution (VSR) aims at recovering high‐resolution frames from their low‐resolution counterparts. Over the past few years, deep neural networks have dominated the video super‐resolution task because of their strong non‐linear representational ability. To exploit temporal correlations, most deep neural networks face two challenges: (1) how to align consecutive frames containing motion, occlusion and blurring, and establish accurate temporal correspondences; (2) how to effectively fuse the aligned frames and balance their contributions. In this work, a novel video super‐resolution network, named NLVSR, is proposed to solve the above problems in an efficient and effective manner. For alignment, a temporal‐spatial non‐local operation is employed to align each frame to the reference frame. Compared with existing alignment approaches, the proposed temporal‐spatial non‐local operation integrates the global information of each frame through a weighted sum, leading to better alignment. For fusion, an attention‐based progressive fusion framework is designed to integrate the aligned frames gradually. To penalize low‐quality points in the aligned features, an attention mechanism is employed for robust reconstruction. Experimental results demonstrate the superiority of the proposed network in both quantitative and qualitative evaluation; it surpasses other state‐of‐the‐art methods by at least 0.33 dB. |
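The attention‐based progressive fusion described in the abstract can also be sketched in plain Python (again, not the authors' implementation; the quality score below is a hand‐crafted stand‐in for the learned attention, and all names are hypothetical): frames are merged in small groups, and at each position a frame's contribution is down‐weighted when its feature deviates from the cross‐frame consensus, which penalizes low‐quality points.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_fuse(frames):
    """Fuse aligned frames position by position.

    Each frame is a list of positions; each position a feature vector.
    Negative squared distance to the cross-frame mean acts as a crude
    quality score, so outlier (low-quality) features get small softmax
    weights in the fused result.
    """
    fused = []
    for p in range(len(frames[0])):
        feats = [f[p] for f in frames]
        d = len(feats[0])
        mean = [sum(v[c] for v in feats) / len(feats) for c in range(d)]
        scores = [-sum((v[c] - mean[c]) ** 2 for c in range(d)) for v in feats]
        w = softmax(scores)
        fused.append([
            sum(w[i] * feats[i][c] for i in range(len(feats)))
            for c in range(d)
        ])
    return fused

def progressive_fuse(frames, group=2):
    """Merge frames a few at a time until a single frame remains."""
    while len(frames) > 1:
        frames = [attention_fuse(frames[i:i + group])
                  for i in range(0, len(frames), group)]
    return frames[0]
```

Fusing in small groups rather than all at once is what "progressive" refers to here: each stage only has to balance a handful of inputs, which is the structural idea the abstract attributes to the fusion framework.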
first_indexed | 2024-04-12T16:27:25Z |
format | Article |
id | doaj.art-6137d0fd8af84d11aed997b88c405dfd |
institution | Directory Open Access Journal |
issn | 1751-9659, 1751-9667 |
language | English |
last_indexed | 2024-04-12T16:27:25Z |
publishDate | 2021-06-01 |
publisher | Wiley |
record_format | Article |
series | IET Image Processing |
spelling | doaj.art-6137d0fd8af84d11aed997b88c405dfd. IET Image Processing (Wiley), ISSN 1751-9659 / 1751-9667, 2021-06-01, vol. 15, no. 8, pp. 1655–1667, https://doi.org/10.1049/ipr2.12134. "Video super‐resolution with non‐local alignment network". Chao Zhou (School of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China); Can Chen (School of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China); Fei Ding (School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China); Dengyin Zhang (School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China). |
spellingShingle | Chao Zhou; Can Chen; Fei Ding; Dengyin Zhang; Video super‐resolution with non‐local alignment network; IET Image Processing |
title | Video super‐resolution with non‐local alignment network |
title_full | Video super‐resolution with non‐local alignment network |
title_fullStr | Video super‐resolution with non‐local alignment network |
title_full_unstemmed | Video super‐resolution with non‐local alignment network |
title_short | Video super‐resolution with non‐local alignment network |
title_sort | video super resolution with non local alignment network |
url | https://doi.org/10.1049/ipr2.12134 |
work_keys_str_mv | AT chaozhou videosuperresolutionwithnonlocalalignmentnetwork AT canchen videosuperresolutionwithnonlocalalignmentnetwork AT feiding videosuperresolutionwithnonlocalalignmentnetwork AT dengyinzhang videosuperresolutionwithnonlocalalignmentnetwork |