DeepRare: Generic Unsupervised Visual Attention Models

Visual attention selects the data that humans consider “interesting”, and it is modeled in engineering by feature-engineered methods that find contrasted, surprising, or unusual image data. Deep learning has drastically improved model performance on the main benchmark datasets. However, Deep Neural Network-based (DNN-based) models are counterintuitive: surprising or unusual data are, by definition, difficult to learn because of their low occurrence probability. In practice, DNN-based models mainly learn top-down features such as faces, text, people, or animals, which usually attract human attention, but they are not effective at extracting surprising or unusual data from images. In this article, we propose a new family of visual attention models called DeepRare, and in particular DeepRare2021 (DR21), which combines the power of DNN feature extraction with the genericity of feature-engineered algorithms. This algorithm is an evolution of a previous version, DeepRare2019 (DR19), built on the same framework. DR21 (1) needs no training beyond the default ImageNet training, (2) is fast even on CPU, and (3) is evaluated on four very different eye-tracking datasets, showing that it is generic and always among the top models on all datasets and metrics, while no other model exhibits such regularity and genericity. Finally, DR21 (4) is tested with several network architectures, such as VGG16 (V16), VGG19 (V19), and MobileNetV2 (MN2), and (5) it provides explanation and transparency about which parts of the image are the most surprising at different levels, despite the use of a DNN-based feature extractor.


Bibliographic Details
Main Authors: Phutphalla Kong, Matei Mancas, Bernard Gosselin, Kimtho Po
Author Affiliations: Phutphalla Kong and Kimtho Po, Institute of Technology of Cambodia (ITC), Russian Conf. Blvd., Phnom Penh P.O. Box 86, Cambodia; Matei Mancas and Bernard Gosselin, Numediart Institute, University of Mons (UMONS), 31, Bd. Dolez, 7000 Mons, Belgium
Format: Article
Language: English
Published: MDPI AG, 2022-05-01
Series: Electronics, Volume 11, Issue 11, Article 1696
ISSN: 2079-9292
DOI: 10.3390/electronics11111696
Subjects: eye tracking; deep features; odd one out; rarity; saliency; visual attention prediction
Online Access: https://www.mdpi.com/2079-9292/11/11/1696
Collection: DOAJ (Directory of Open Access Journals)
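
Illustrative note: the abstract above describes combining features from an ImageNet-pretrained backbone (e.g., VGG16) with a feature-engineered rarity measure to predict attention without any additional training. The short Python sketch below is only a rough, hypothetical illustration of that family of ideas; the chosen layer indices, the histogram-based rarity score, and the fusion step are assumptions made for illustration and are not the published DR21 algorithm (see the article at the URL above for the actual method).

# Rough, hypothetical sketch of a rarity-on-deep-features saliency map.
# Assumptions (not from the paper): VGG16 layer indices 8, 15 and 22,
# per-channel histogram self-information as the rarity measure, and
# min-max normalised summation as the fusion step.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

def channel_rarity(fmap, bins=16):
    # fmap: (channels, h, w) feature map from one layer.
    # Score each location by -log(probability) of its activation bin,
    # averaged over channels: rare activations get high scores.
    c, h, w = fmap.shape
    rarity = torch.zeros(h, w)
    used = 0
    for ch in range(c):
        x = fmap[ch]
        lo, hi = float(x.min()), float(x.max())
        if hi <= lo:  # constant channel carries no rarity information
            continue
        hist = torch.histc(x, bins=bins, min=lo, max=hi)
        p = hist / hist.sum().clamp(min=1e-8)
        idx = ((x - lo) / (hi - lo) * (bins - 1)).long()
        rarity += -torch.log(p[idx].clamp(min=1e-8))
        used += 1
    return rarity / max(used, 1)

def deep_rarity_saliency(img_path, layer_ids=(8, 15, 22), size=224):
    # ImageNet-pretrained VGG16 used as a frozen feature extractor (no extra training).
    backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
    prep = transforms.Compose([
        transforms.Resize((size, size)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    x = prep(Image.open(img_path).convert("RGB")).unsqueeze(0)
    saliency = torch.zeros(size, size)
    with torch.no_grad():
        for i, layer in enumerate(backbone):
            x = layer(x)
            if i in layer_ids:
                r = channel_rarity(x[0])
                r = F.interpolate(r[None, None], size=(size, size),
                                  mode="bilinear", align_corners=False)[0, 0]
                saliency += (r - r.min()) / (r.max() - r.min() + 1e-8)
    return saliency / len(layer_ids)  # higher values = more "surprising" regions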