DeepRare: Generic Unsupervised Visual Attention Models
Visual attention selects the data that humans consider “interesting”, and it is modeled in engineering by feature-engineered methods that look for contrasted, surprising, or unusual image data. Deep learning has drastically improved model performance on the main benchmark datasets. However, Deep Neural Network-based (DNN-based) models are counterintuitive: surprising or unusual data are by definition difficult to learn because of their low occurrence probability. In practice, DNN-based models mainly learn top-down features such as faces, text, people, or animals, which usually attract human attention, but they are much less effective at extracting surprising or unusual data from images. In this article, we propose a new family of visual attention models called DeepRare, and in particular DeepRare2021 (DR21), which combines the power of DNN feature extraction with the genericity of feature-engineered algorithms. The algorithm is an evolution of a previous version, DeepRare2019 (DR19), built on the same framework. DR21 (1) needs no training other than the default ImageNet training, (2) is fast even on CPU, (3) is evaluated on four very different eye-tracking datasets, where it is consistently among the top models on all datasets and metrics, a regularity and genericity that no other model exhibits. Finally, DR21 (4) is tested with several network architectures such as VGG16 (V16), VGG19 (V19), and MobileNetV2 (MN2), and (5) it provides explanation and transparency about which parts of the image are the most surprising at different levels, despite the use of a DNN-based feature extractor.
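The abstract describes DR21 as a frozen, ImageNet-pretrained backbone (e.g., VGG16) whose deep features are scored by a feature-engineered rarity measure rather than by any additional training. As an illustration only, below is a minimal PyTorch sketch of that rarity-of-deep-features idea: the chosen layers, the histogram-based self-information estimate, the summation-based fusion, and the file name `example.jpg` are assumptions made for the sketch, not the authors' published DeepRare2021 implementation.

```python
# Hypothetical sketch of "rarity of deep features": feature maps come from an
# ImageNet-pretrained VGG16 (no extra training), each activation gets a
# self-information score (-log p) within its channel, and per-layer rarity
# maps are fused into one saliency map.
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

def channel_rarity(fmap, bins=16):
    """Per-pixel self-information of activations, computed channel by channel."""
    rarity = torch.zeros(fmap.shape[1:])          # fmap: (C, H, W)
    for ch in fmap:
        lo, hi = ch.min().item(), ch.max().item()
        if hi <= lo:                              # constant channel carries no surprise
            continue
        hist = torch.histc(ch, bins=bins, min=lo, max=hi)
        p = hist / hist.sum()
        idx = ((ch - lo) / (hi - lo) * bins).long().clamp(max=bins - 1)
        rarity += -torch.log(p[idx] + 1e-8)       # rare activations -> high values
    return rarity

# ImageNet-pretrained backbone, used only as a frozen feature extractor.
vgg = models.vgg16(weights="IMAGENET1K_V1").features.eval()
layer_ids = {3, 8, 15, 22, 29}                    # assumed: last ReLU of each VGG block

prep = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
x = prep(Image.open("example.jpg").convert("RGB")).unsqueeze(0)

saliency = torch.zeros(224, 224)
with torch.no_grad():
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in layer_ids:
            r = channel_rarity(x[0])
            r = F.interpolate(r[None, None], size=(224, 224),
                              mode="bilinear", align_corners=False)[0, 0]
            saliency += (r - r.min()) / (r.max() - r.min() + 1e-8)

saliency /= saliency.max() + 1e-8                 # final attention map in [0, 1]
```

Because the backbone only serves as a frozen feature extractor, swapping `vgg16` for `vgg19` or `mobilenet_v2` (with layer indices adjusted accordingly) gives the kind of backbone comparison mentioned in the abstract.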
Main Authors: | Phutphalla Kong, Matei Mancas, Bernard Gosselin, Kimtho Po |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-05-01 |
Series: | Electronics |
Subjects: | eye tracking; deep features; odd one out; rarity; saliency; visual attention prediction |
Online Access: | https://www.mdpi.com/2079-9292/11/11/1696 |
collection | DOAJ |
id | doaj.art-526a15ab9f244a4d89e856cdc4614021 |
institution | Directory of Open Access Journals |
issn | 2079-9292 |
doi | 10.3390/electronics11111696 |
citation | Electronics, vol. 11, no. 11, article 1696 (2022) |
affiliations | Phutphalla Kong and Kimtho Po: Institute of Technology of Cambodia (ITC), Russian Conf. Blvd., Phnom Penh P.O. Box 86, Cambodia; Matei Mancas and Bernard Gosselin: Numediart Institute, University of Mons (UMONS), 31, Bd. Dolez, 7000 Mons, Belgium |