Concentration or distraction? A synergetic-based attention weights optimization method
Abstract: The attention mechanism extends deep learning to a broader range of applications, but the contribution of the attention module is highly controversial. Research on modern Hopfield networks indicates that the attention mechanism can also be used in shallow networks, where its automatic sample filtering facilitates instance extraction in Multiple Instance Learning tasks. Since the attention mechanism has a clear contribution and intuitive performance in shallow networks, this paper further investigates its optimization method based on the recurrent neural network. Through comprehensive comparison, we find that the Synergetic Neural Network has the advantages of more accurate and controllable convergence and reversible converging steps. We therefore design the Syn layer based on the Synergetic Neural Network and propose a novel invertible activation function as the forward and backward update formula for concentrating or distracting attention weights. Experimental results show that our method outperforms other methods on all Multiple Instance Learning benchmark datasets: concentration improves the robustness of the results, while distraction expands the instance-observing space and yields better results. Code is available at https://github.com/wzh134/Syn.
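The "controllable and reversible convergence" attributed to the Synergetic Neural Network refers to Haken's order-parameter competition. For context, this is the textbook formulation (not necessarily the paper's exact Syn-layer update), with the order parameters $\xi_k$ playing the role of attention weights:

$$\dot{\xi}_k = \lambda_k \xi_k \;-\; B \sum_{j \neq k} \xi_j^2\, \xi_k \;-\; C \Big(\sum_j \xi_j^2\Big)\, \xi_k ,$$

where the $\lambda_k$ are attention parameters and $B, C > 0$ control the competition. Integrated forward, the largest $\xi_k$ wins while the others decay to zero (concentration); because each discretized step is invertible, the update can also be run backward to flatten the weights again (distraction). A runnable sketch of this idea appears at the end of this record.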
Main Authors: | Zihao Wang, Haifeng Li, Lin Ma, Feng Jiang |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2023-06-01 |
Series: | Complex & Intelligent Systems |
Subjects: | Attention; Synergetic neural network; Recurrent neural network; Multiple instance learning; Shallow network |
Online Access: | https://doi.org/10.1007/s40747-023-01133-0 |
ISSN: | 2199-4536, 2198-6053 |
Citation: | Complex & Intelligent Systems, vol. 9, no. 6, pp. 7381-7393 (2023-06-01), doi:10.1007/s40747-023-01133-0 |
Affiliations: | Zihao Wang, Haifeng Li, Lin Ma: Faculty of Computing, Harbin Institute of Technology; Feng Jiang: School of Medicine and Health, Harbin Institute of Technology |
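For readers who want the gist before opening the repository, below is a minimal, self-contained sketch of a discretized synergetic update applied to a vector of attention weights. The function name `syn_step` and the hyperparameters `lam`, `B`, `C`, and `gamma` are illustrative assumptions, not the authors' published Syn layer; the actual implementation is at https://github.com/wzh134/Syn.

```python
import numpy as np

def syn_step(xi, lam=1.0, B=1.0, C=1.0, gamma=0.1, steps=5, reverse=False):
    """Discretized Haken-style order-parameter updates on attention weights.

    Forward steps sharpen the vector toward its largest entry
    ("concentration"); negating the step size walks the same dynamics
    backward and flattens the vector ("distraction"). lam, B, C, and
    gamma are illustrative hyperparameters, not the paper's values.
    """
    g = -gamma if reverse else gamma
    xi = np.asarray(xi, dtype=float)
    for _ in range(steps):
        total = np.sum(xi ** 2)
        # Haken dynamics: lam*xi_k - B*sum_{j != k}(xi_j^2)*xi_k - C*total*xi_k
        dxi = lam * xi - B * (total - xi ** 2) * xi - C * total * xi
        xi = np.clip(xi + g * dxi, 1e-8, None)  # keep the weights positive
    return xi / xi.sum()  # renormalize so the weights sum to one

if __name__ == "__main__":
    w = np.array([0.4, 0.3, 0.2, 0.1])
    print("concentrated:", syn_step(w))                # mass shifts toward w[0]
    print("distracted:  ", syn_step(w, reverse=True))  # distribution flattens
```

Because the relative growth rate of each weight increases with its magnitude, forward iteration is winner-take-all, while the reversed step undoes that sharpening; this is the sense in which such an update can serve as a forward/backward formula for concentrating or distracting attention weights.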