Blind video quality prediction by uncovering human video perceptual representation


Bibliographic Details
Main Authors: Liao, Liang, Xu, Kangmin, Wu, Haoning, Chen, Chaofeng, Sun, Wenxiu, Yan, Qiong, Kuo, Jay C.-C., Lin, Weisi
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Video quality assessment; Human visual system
Online Access:https://hdl.handle.net/10356/181046
_version_ 1826120520618213376
author Liao, Liang
Xu, Kangmin
Wu, Haoning
Chen, Chaofeng
Sun, Wenxiu
Yan, Qiong
Kuo, Jay C.-C.
Lin, Weisi
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Liao, Liang
Xu, Kangmin
Wu, Haoning
Chen, Chaofeng
Sun, Wenxiu
Yan, Qiong
Kuo, Jay C.-C.
Lin, Weisi
author_sort Liao, Liang
collection NTU
description Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core factor that distinguishes VQA from image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to human perception. Inspired by the recent discovery of the temporal straightness law of natural videos in the HVS, this paper models the complex temporal distortions of in-the-wild videos in a simple and uniform representation by describing the geometric properties of videos in the visual perceptual domain. A novel videolet, with a perceptual representation embedding of a few consecutive frames, is designed as the basic quality measurement unit to quantify temporal distortions by measuring the angular and linear displacements from the straightness law. By combining the predicted score on each videolet, a perceptually temporal quality evaluator (PTQE) is formed to measure the temporal quality of the entire video. Experimental results demonstrate that the perceptual representation in the HVS is an efficient way of predicting subjective temporal quality. Moreover, when combined with spatial quality metrics, PTQE achieves top performance over popular in-the-wild video datasets. More importantly, PTQE requires no additional information beyond the video being assessed, making it applicable to any dataset without parameter tuning. Additionally, the generalizability of PTQE is evaluated on video frame interpolation tasks, demonstrating its potential to benefit temporal-related enhancement tasks.
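The abstract's core idea, measuring how far a videolet's trajectory in a perceptual embedding space deviates from a straight, evenly spaced path, can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding itself is assumed to be given (here just a `(T, D)` array of per-frame vectors), and the pooling of videolet scores into a whole-video score is a hypothetical average, not the paper's learned combination.

```python
import numpy as np

def videolet_displacements(embeddings):
    """Angular and linear displacements of one videolet's trajectory.

    `embeddings` is a (T, D) array: one perceptual-domain vector per frame.
    Under the temporal-straightness hypothesis, a pristine natural video
    traces a nearly straight, evenly spaced path in this space; temporal
    distortions appear as bends (angular displacement) and uneven step
    sizes (linear displacement).
    """
    diffs = np.diff(embeddings, axis=0)        # (T-1, D) step vectors
    norms = np.linalg.norm(diffs, axis=1)      # step lengths

    # Angular displacement: angle between consecutive step vectors
    # (0 rad means the path continues perfectly straight).
    cos = np.sum(diffs[:-1] * diffs[1:], axis=1) / (norms[:-1] * norms[1:])
    angles = np.arccos(np.clip(cos, -1.0, 1.0))

    # Linear displacement: change in step length (0 means uniform speed).
    linear = np.abs(np.diff(norms))
    return angles, linear

def temporal_quality_score(video_embeddings, window=5):
    """Toy pooling: average straightness deviation over sliding videolets.

    Lower scores mean a straighter, more evenly paced perceptual
    trajectory, i.e. better temporal quality under this hypothesis.
    """
    T = len(video_embeddings)
    scores = []
    for start in range(T - window + 1):
        ang, lin = videolet_displacements(video_embeddings[start:start + window])
        scores.append(ang.mean() + lin.mean())
    return float(np.mean(scores))
```

A perfectly straight, evenly spaced trajectory scores (numerically) zero, while adding noise to the same trajectory raises the score, matching the qualitative behavior the abstract describes.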
first_indexed 2025-03-09T12:43:59Z
format Journal Article
id ntu-10356/181046
institution Nanyang Technological University
language English
last_indexed 2025-03-09T12:43:59Z
publishDate 2024
record_format dspace
spelling ntu-10356/1810462024-11-12T05:23:18Z Blind video quality prediction by uncovering human video perceptual representation Liao, Liang Xu, Kangmin Wu, Haoning Chen, Chaofeng Sun, Wenxiu Yan, Qiong Kuo, Jay C.-C. Lin, Weisi School of Computer Science and Engineering Computer and Information Science Video quality assessment Human visual system Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core to distinguish between VQA and image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to the human perception. Inspired by the recent discovery of the temporal straightness law of natural videos in the HVS, this paper intends to model the complex temporal distortions of in-the-wild videos in a simple and uniform representation by describing the geometric properties of videos in the visual perceptual domain. A novel videolet, with perceptual representation embedding of a few consecutive frames, is designed as the basic quality measurement unit to quantify temporal distortions by measuring the angular and linear displacements from the straightness law. By combining the predicted score on each videolet, a perceptually temporal quality evaluator (PTQE) is formed to measure the temporal quality of the entire video. Experimental results demonstrate that the perceptual representation in the HVS is an efficient way of predicting subjective temporal quality. Moreover, when combined with spatial quality metrics, PTQE achieves top performance over popular in-the-wild video datasets. More importantly, PTQE requires no additional information beyond the video being assessed, making it applicable to any dataset without parameter tuning. 
Additionally, the generalizability of PTQE is evaluated on video frame interpolation tasks, demonstrating its potential to benefit temporal-related enhancement tasks. Agency for Science, Technology and Research (A*STAR) This work was supported in part under the RIE2020 Industry Alignment Fund–Industry Collaboration Projects (IAF-ICP) Funding Initiative, in part by the National Natural Science Foundation of China under Grant 62202349, and in part by the Young Elite Scientists Sponsorship Program by CAST under Grant 2023QNRC001. 2024-11-12T05:23:18Z 2024-11-12T05:23:18Z 2024 Journal Article Liao, L., Xu, K., Wu, H., Chen, C., Sun, W., Yan, Q., Kuo, J. C. & Lin, W. (2024). Blind video quality prediction by uncovering human video perceptual representation. IEEE Transactions On Image Processing, 33, 4998-5013. https://dx.doi.org/10.1109/TIP.2024.3445738 1941-0042 https://hdl.handle.net/10356/181046 10.1109/TIP.2024.3445738 39236121 2-s2.0-85203558671 33 4998 5013 en IAF-ICP IEEE Transactions on Image Processing © 2024 IEEE. All rights reserved.
spellingShingle Computer and Information Science
Video quality assessment
Human visual system
Liao, Liang
Xu, Kangmin
Wu, Haoning
Chen, Chaofeng
Sun, Wenxiu
Yan, Qiong
Kuo, Jay C.-C.
Lin, Weisi
Blind video quality prediction by uncovering human video perceptual representation
title Blind video quality prediction by uncovering human video perceptual representation
title_full Blind video quality prediction by uncovering human video perceptual representation
title_fullStr Blind video quality prediction by uncovering human video perceptual representation
title_full_unstemmed Blind video quality prediction by uncovering human video perceptual representation
title_short Blind video quality prediction by uncovering human video perceptual representation
title_sort blind video quality prediction by uncovering human video perceptual representation
topic Computer and Information Science
Video quality assessment
Human visual system
url https://hdl.handle.net/10356/181046
work_keys_str_mv AT liaoliang blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation
AT xukangmin blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation
AT wuhaoning blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation
AT chenchaofeng blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation
AT sunwenxiu blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation
AT yanqiong blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation
AT kuojaycc blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation
AT linweisi blindvideoqualitypredictionbyuncoveringhumanvideoperceptualrepresentation