Visual number sense for real-world scenes shared by deep neural networks and humans

Bibliographic Details
Main Authors: Wu Wencheng, Yingxi Ge, Zhentao Zuo, Lin Chen, Xu Qin, Liu Zuxiang
Format: Article
Language: English
Published: Elsevier 2023-08-01
Series: Heliyon
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844023057250
Description
Summary: Visual number sense has recently been identified in deep neural networks (DNNs). However, it is unclear whether DNNs have the same capacity for real-world scenes, rather than the simple geometric figures on which they are usually tested. In this study, we explore the number perception of scenes using AlexNet and find that numerosity can be represented by the pattern of group activation of the category layer units. The global activation of these units increases with the number of objects in the scene, and the variation in their activation decreases accordingly. By decoding numerosity from this pattern, we reveal that the embedding coefficient of a scene determines how likely potential objects are to contribute to numerical perception. This is demonstrated by better performance, in both DNNs and humans, for pictures with relatively high embedding coefficients. This study shows for the first time that a distinct feature of visual environments, revealed by DNNs, can modulate human perception, supported by a group-coding mechanism.
ISSN: 2405-8440
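
Illustrative note: the summary's idea of reading numerosity out of a group activation pattern can be sketched on toy data. The following is a minimal sketch only, not the authors' method. It assumes synthetic activations in place of real AlexNet category-layer outputs, a linear growth of each unit's response with object count, and a plain least-squares readout. All names (`simulate_activations`, `fit_linear_decoder`, `decode`, `gain`) are hypothetical.

```python
import numpy as np

N_UNITS = 50  # hypothetical number of category-layer units

def simulate_activations(counts, gain, rng, noise_sd=0.5):
    """Toy stand-in for category-layer activations (NOT real AlexNet output).

    Assumption: each unit's mean response grows linearly with object count,
    scaled by a per-unit gain, plus Gaussian noise.
    """
    counts = np.asarray(counts, dtype=float)
    signal = counts[:, None] * gain[None, :]
    return signal + rng.normal(0.0, noise_sd, size=signal.shape)

def fit_linear_decoder(X, y):
    """Least-squares readout: numerosity is approximated by X @ w + b."""
    X1 = np.column_stack([X, np.ones(len(X))])  # append a bias column
    coef, *_ = np.linalg.lstsq(X1, np.asarray(y, dtype=float), rcond=None)
    return coef

def decode(X, coef):
    """Apply the fitted readout to new activation patterns."""
    X1 = np.column_stack([X, np.ones(len(X))])
    return X1 @ coef

rng = np.random.default_rng(0)
gain = rng.uniform(0.5, 1.5, size=N_UNITS)  # hypothetical unit sensitivities

# Synthetic "scenes" containing 1 to 8 objects.
train_counts = rng.integers(1, 9, size=400)
test_counts = rng.integers(1, 9, size=100)
X_train = simulate_activations(train_counts, gain, rng)
X_test = simulate_activations(test_counts, gain, rng)

coef = fit_linear_decoder(X_train, train_counts)
pred = decode(X_test, coef)
mae = float(np.mean(np.abs(pred - test_counts)))

# Global (mean) activation per object count, for the training scenes.
global_activation = {
    int(n): float(X_train[train_counts == n].mean())
    for n in np.unique(train_counts)
}
print("mean absolute decoding error:", mae)
print("global activation by count:", global_activation)
```

In this toy setup the mean activation rises with the object count by construction, so the sketch only illustrates what a pattern-based readout looks like. It says nothing about whether real network units behave this way.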