Concentration or distraction? A synergetic-based attention weights optimization method

Abstract: The attention mechanism extends deep learning to a broader range of applications, but the contribution of the attention module remains highly controversial. Research on modern Hopfield networks indicates that the attention mechanism can also be used in shallow networks, where its automatic sample filtering facilitates instance extraction in Multiple Instance Learning tasks. Since the attention mechanism has a clear contribution and intuitive performance in shallow networks, this paper further investigates its optimization method based on the recurrent neural network. Through a comprehensive comparison, we find that the Synergetic Neural Network offers more accurate and controllable convergence as well as reversible converging steps. We therefore design the Syn layer based on the Synergetic Neural Network and propose a novel invertible activation function as the forward and backward update formula for concentrating or distracting the attention weights. Experimental results show that our method outperforms other methods on all Multiple Instance Learning benchmark datasets: concentration improves the robustness of the results, while distraction expands the instance observation space and yields better results. Code is available at https://github.com/wzh134/Syn.
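
The abstract describes sharpening ("concentration") or flattening ("distraction") of attention weights through a Synergetic Neural Network update that can be run forward or in reverse. As a rough illustration of that idea only (not the authors' Syn layer, whose actual code is in the repository linked above), the NumPy sketch below applies Haken-style synergetic order-parameter dynamics to a set of instance-level attention weights; the function name, the parameters lam, B, C and gamma, the step count, and the renormalization step are all assumptions made for this sketch.

    import numpy as np

    def synergetic_update(weights, lam=1.0, B=1.0, C=1.0, gamma=0.1, steps=5, concentrate=True):
        """Sharpen (concentrate) or flatten (distract) attention weights using
        Haken-style synergetic order-parameter dynamics. Illustrative sketch only;
        not the Syn layer from the paper."""
        xi = np.asarray(weights, dtype=float).copy()
        sign = 1.0 if concentrate else -1.0          # reversing the step spreads the weights instead
        for _ in range(steps):
            total = np.sum(xi ** 2)
            # Competition term: each weight is suppressed by the squared magnitude of the others.
            delta = xi * (lam - B * (total - xi ** 2) - C * total)
            xi = np.clip(xi + sign * gamma * delta, 1e-8, None)
            xi = xi / xi.sum()                       # keep a valid attention distribution
        return xi

    # Toy usage on three instance-level attention weights from a MIL pooling layer:
    w = [0.5, 0.3, 0.2]
    print(synergetic_update(w, concentrate=True))    # mass shifts toward the leading instance
    print(synergetic_update(w, concentrate=False))   # mass spreads back toward uniform

In the paper itself the forward and backward steps are realized by an invertible activation function inside the Syn layer; the explicit sign flip here only mimics that reversibility.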

Bibliographic Details
Main Authors: Zihao Wang, Haifeng Li, Lin Ma, Feng Jiang
Affiliations: Faculty of Computing, Harbin Institute of Technology (Zihao Wang, Haifeng Li, Lin Ma); School of Medicine and Health, Harbin Institute of Technology (Feng Jiang)
Format: Article
Language: English
Published: Springer, 2023-06-01
Series: Complex & Intelligent Systems, vol. 9, no. 6, pp. 7381–7393
ISSN: 2199-4536, 2198-6053
Subjects: Attention; Synergetic neural network; Recurrent neural network; Multiple instance learning; Shallow network
Online Access: https://doi.org/10.1007/s40747-023-01133-0
Record: doaj.art-6b11ad529094460eadd89bd74ec8866a (Directory of Open Access Journals)