No-Reference Video Quality Assessment Using Distortion Learning and Temporal Attention

The rapid growth of video consumption and multimedia applications has increased the interest of academia and industry in building tools that can evaluate perceptual video quality. Since videos may be distorted when they are captured or transmitted, it is imperative to develop reliable methods...


Bibliographic Details
Main Authors: Koffi Kossi, Stephane Coulombe, Christian Desrosiers, Ghyslain Gagnon
Format: Article
Language: English
Published: IEEE, 2022-01-01
Series: IEEE Access
Subjects: Video quality assessment; no reference; transfer learning; multi-task learning; attention mechanism; authentic distortion
Online Access: https://ieeexplore.ieee.org/document/9757199/
author Koffi Kossi
Stephane Coulombe
Christian Desrosiers
Ghyslain Gagnon
collection DOAJ
description The rapid growth of video consumption and multimedia applications has increased the interest of academia and industry in building tools that can evaluate perceptual video quality. Since videos may be distorted when they are captured or transmitted, it is imperative to develop reliable methods for no-reference video quality assessment (NR-VQA). To date, most NR-VQA models in the prior art have been designed to assess a specific category of distortion, such as authentic distortions or traditional distortions. Moreover, those developed for video databases containing both authentic and traditional distortions have so far performed poorly. This has made service providers reluctant to adopt multiple NR-VQA approaches, as they prefer a single algorithm capable of accurately estimating video quality in all situations. Furthermore, many existing NR-VQA methods are computationally complex and therefore impractical for many real-life applications. In this paper, we propose a novel deep learning method for NR-VQA based on multi-task learning, in which the distortion of individual frames and the overall quality of the video are predicted by a single neural network. This allows the network to be trained with a greater amount and variety of data, thereby improving its performance at test time. Additionally, our method leverages temporal attention to select the frames of a video sequence that contribute the most to its perceived quality. The proposed algorithm is evaluated on five publicly available video quality assessment (VQA) databases containing traditional and authentic distortions. Results show that our method outperforms the state of the art on traditional distortion databases such as LIVE VQA and CSIQ video, while also delivering competitive performance on databases containing authentic distortions such as KoNViD-1k, LIVE-Qualcomm, and CVD2014.
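The kind of architecture the abstract outlines (a shared per-frame encoder feeding both a frame-level distortion head and an attention-pooled video-quality head) can be sketched roughly as follows. This is an illustrative PyTorch sketch only: the ResNet-50 backbone, the attention layer sizes, and the number of distortion classes are assumptions for illustration, not the configuration reported in the paper.

```python
# Minimal sketch of a multi-task NR-VQA network with temporal attention,
# loosely following the abstract: one shared frame encoder, a per-frame
# distortion head, and an attention-weighted video-quality head.
# Backbone choice, layer sizes, and class count are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models


class MultiTaskNRVQA(nn.Module):
    def __init__(self, num_distortion_classes: int = 6, feat_dim: int = 2048):
        super().__init__()
        backbone = models.resnet50(weights=None)            # shared frame encoder
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])
        # Task 1: per-frame distortion classification.
        self.distortion_head = nn.Linear(feat_dim, num_distortion_classes)
        # Temporal attention: score each frame's contribution to quality.
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.Tanh(), nn.Linear(256, 1)
        )
        # Task 2: video-level quality regression on the attended feature.
        self.quality_head = nn.Linear(feat_dim, 1)

    def forward(self, frames: torch.Tensor):
        # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).flatten(1)   # (b*t, feat_dim)
        feats = feats.view(b, t, -1)
        distortion_logits = self.distortion_head(feats)         # (b, t, classes)
        attn = torch.softmax(self.attention(feats), dim=1)      # (b, t, 1)
        video_feat = (attn * feats).sum(dim=1)                  # (b, feat_dim)
        quality = self.quality_head(video_feat).squeeze(-1)     # (b,)
        return distortion_logits, quality


if __name__ == "__main__":
    model = MultiTaskNRVQA()
    clip = torch.randn(2, 8, 3, 224, 224)    # 2 clips of 8 frames each
    logits, score = model(clip)
    print(logits.shape, score.shape)          # (2, 8, 6) and (2,)
```

In this sketch, the frame-level distortion labels and the video-level quality score are trained jointly through the shared encoder, and the softmax attention weights determine how much each frame contributes to the pooled video representation, mirroring the multi-task and temporal-attention ideas described above.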
first_indexed 2024-12-12T23:19:43Z
format Article
id doaj.art-0902ee41d16342ec85e21bfd4cc2350e
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-12T23:19:43Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling IEEE Access, vol. 10, pp. 41010-41022, published 2022-01-01. DOI: 10.1109/ACCESS.2022.3167446. IEEE document 9757199.
No-Reference Video Quality Assessment Using Distortion Learning and Temporal Attention
Koffi Kossi, https://orcid.org/0000-0002-3570-7896, Department of Electrical Engineering, École de technologie supérieure, Université du Québec, Montreal, QC, Canada
Stephane Coulombe, https://orcid.org/0000-0003-4495-3906, Department of Software and IT Engineering, École de technologie supérieure, Université du Québec, Montreal, QC, Canada
Christian Desrosiers, https://orcid.org/0000-0002-9162-9650, Department of Software and IT Engineering, École de technologie supérieure, Université du Québec, Montreal, QC, Canada
Ghyslain Gagnon, https://orcid.org/0000-0001-9484-7218, Department of Electrical Engineering, École de technologie supérieure, Université du Québec, Montreal, QC, Canada
title No-Reference Video Quality Assessment Using Distortion Learning and Temporal Attention
topic Video quality assessment
no reference
transfer learning
multi-task learning
attention mechanism
authentic distortion
url https://ieeexplore.ieee.org/document/9757199/