iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks

Abstract Recently, learning-based models have enhanced the performance of single-image super-resolution (SISR). However, applying SISR successively to each video frame leads to a lack of temporal coherency. Convolutional neural networks (CNNs) outperform traditional approaches in terms of image qual...


Bibliographic Details
Main Authors: Aman Chadha, John Britto, M. Mani Roja
Format: Article
Language: English
Published: SpringerOpen 2020-07-01
Series: Computational Visual Media
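Subjects: super resolution; video upscaling; frame recurrence; optical flow; generative adversarial networks; convolutional neural networks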
Online Access: https://doi.org/10.1007/s41095-020-0175-7
_version_ 1818619706065551360
author Aman Chadha
John Britto
M. Mani Roja
author_facet Aman Chadha
John Britto
M. Mani Roja
author_sort Aman Chadha
collection DOAJ
description Abstract Recently, learning-based models have enhanced the performance of single-image super-resolution (SISR). However, applying SISR successively to each video frame leads to a lack of temporal coherency. Convolutional neural networks (CNNs) outperform traditional approaches in terms of image quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). On the other hand, generative adversarial networks (GANs) offer a competitive advantage by being able to mitigate the issue of a lack of finer texture details, usually seen with CNNs when super-resolving at large upscaling factors. We present iSeeBetter, a novel GAN-based spatio-temporal approach to video super-resolution (VSR) that renders temporally consistent super-resolution videos. iSeeBetter extracts spatial and temporal information from the current and neighboring frames using the concept of recurrent back-projection networks as its generator. Furthermore, to improve the “naturality” of the super-resolved output while eliminating artifacts seen with traditional algorithms, we utilize the discriminator from the super-resolution generative adversarial network. Although mean squared error (MSE) as a primary loss-minimization objective improves PSNR/SSIM, these metrics may not capture fine details in the image, resulting in misrepresentation of perceptual quality. To address this, we use a four-fold (MSE, perceptual, adversarial, and total-variation) loss function. Our results demonstrate that iSeeBetter offers superior VSR fidelity and surpasses state-of-the-art performance.
first_indexed 2024-12-16T17:41:44Z
format Article
id doaj.art-df23ae14f18d4ecb86a80066b48763dd
institution Directory Open Access Journal
issn 2096-0433
2096-0662
language English
last_indexed 2024-12-16T17:41:44Z
publishDate 2020-07-01
publisher SpringerOpen
record_format Article
series Computational Visual Media
spelling doaj.art-df23ae14f18d4ecb86a80066b48763dd
2022-12-21T22:22:35Z
eng
SpringerOpen
Computational Visual Media
2096-0433
2096-0662
2020-07-01
Volume 6, Issue 3, pp. 307-317
10.1007/s41095-020-0175-7
iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
Aman Chadha (Department of Computer Science, Stanford University)
John Britto (Department of Computer Science, University of Massachusetts Amherst)
M. Mani Roja (Department of Electronics and Telecommunication Engineering, University of Mumbai)
Abstract Recently, learning-based models have enhanced the performance of single-image super-resolution (SISR). However, applying SISR successively to each video frame leads to a lack of temporal coherency. Convolutional neural networks (CNNs) outperform traditional approaches in terms of image quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). On the other hand, generative adversarial networks (GANs) offer a competitive advantage by being able to mitigate the issue of a lack of finer texture details, usually seen with CNNs when super-resolving at large upscaling factors. We present iSeeBetter, a novel GAN-based spatio-temporal approach to video super-resolution (VSR) that renders temporally consistent super-resolution videos. iSeeBetter extracts spatial and temporal information from the current and neighboring frames using the concept of recurrent back-projection networks as its generator. Furthermore, to improve the “naturality” of the super-resolved output while eliminating artifacts seen with traditional algorithms, we utilize the discriminator from the super-resolution generative adversarial network. Although mean squared error (MSE) as a primary loss-minimization objective improves PSNR/SSIM, these metrics may not capture fine details in the image, resulting in misrepresentation of perceptual quality. To address this, we use a four-fold (MSE, perceptual, adversarial, and total-variation) loss function. Our results demonstrate that iSeeBetter offers superior VSR fidelity and surpasses state-of-the-art performance.
https://doi.org/10.1007/s41095-020-0175-7
super resolution
video upscaling
frame recurrence
optical flow
generative adversarial networks
convolutional neural networks
spellingShingle Aman Chadha
John Britto
M. Mani Roja
iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
Computational Visual Media
super resolution
video upscaling
frame recurrence
optical flow
generative adversarial networks
convolutional neural networks
title iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
title_full iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
title_fullStr iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
title_full_unstemmed iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
title_short iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
title_sort iseebetter spatio temporal video super resolution using recurrent generative back projection networks
topic super resolution
video upscaling
frame recurrence
optical flow
generative adversarial networks
convolutional neural networks
url https://doi.org/10.1007/s41095-020-0175-7
work_keys_str_mv AT amanchadha iseebetterspatiotemporalvideosuperresolutionusingrecurrentgenerativebackprojectionnetworks
AT johnbritto iseebetterspatiotemporalvideosuperresolutionusingrecurrentgenerativebackprojectionnetworks
AT mmaniroja iseebetterspatiotemporalvideosuperresolutionusingrecurrentgenerativebackprojectionnetworks