Concentration or distraction? A synergetic-based attention weights optimization method
Abstract: The attention mechanism extends deep learning to a broader range of applications, but the contribution of the attention module is highly controversial. Research on modern Hopfield networks indicates that the attention mechanism can also be used in shallow networks, where its automatic sample filtering facilitates instance extraction in Multiple Instance Learning tasks. Since the attention mechanism has a clear contribution and intuitive performance in shallow networks, this paper further investigates its optimization method based on the recurrent neural network. Through comprehensive comparison, we find that the Synergetic Neural Network has the advantages of more accurate and controllable convergence and reversible converging steps. We therefore design the Syn layer based on the Synergetic Neural Network and propose a novel invertible activation function as the forward and backward update formula for concentrating or distracting attention weights. Experimental results show that our method outperforms other methods on all Multiple Instance Learning benchmark datasets: concentration improves the robustness of the results, while distraction expands the instance-observing space and yields better results. Code is available at https://github.com/wzh134/Syn.
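The "controllable and reversible convergence" attributed to the Synergetic Neural Network refers to Haken's order-parameter competition. For context, this is the textbook formulation (not necessarily the paper's exact Syn-layer update), with the order parameters $\xi_k$ playing the role of attention weights:

$$\dot{\xi}_k = \lambda_k \xi_k \;-\; B \sum_{j \neq k} \xi_j^2\, \xi_k \;-\; C \Big(\sum_j \xi_j^2\Big)\, \xi_k ,$$

where the $\lambda_k$ are attention parameters and $B, C > 0$ control the competition. Integrated forward, the largest $\xi_k$ wins while the others decay to zero (concentration); because each discretized step is invertible, the update can also be run backward to flatten the weights again (distraction). A runnable sketch of this idea appears at the end of this record.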
Main Authors: | Zihao Wang, Haifeng Li, Lin Ma, Feng Jiang |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2023-06-01 |
Series: | Complex & Intelligent Systems |
Subjects: | Attention; Synergetic neural network; Recurrent neural network; Multiple instance learning; Shallow network |
Online Access: | https://doi.org/10.1007/s40747-023-01133-0 |
ISSN: | 2199-4536, 2198-6053 |
Citation: | Complex & Intelligent Systems, vol. 9, no. 6, pp. 7381-7393 (2023-06-01), doi:10.1007/s40747-023-01133-0 |
Affiliations: | Zihao Wang, Haifeng Li, Lin Ma: Faculty of Computing, Harbin Institute of Technology; Feng Jiang: School of Medicine and Health, Harbin Institute of Technology |
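For readers who want the gist before opening the repository, below is a minimal, self-contained sketch of a discretized synergetic update applied to a vector of attention weights. The function name `syn_step` and the hyperparameters `lam`, `B`, `C`, and `gamma` are illustrative assumptions, not the authors' published Syn layer; the actual implementation is at https://github.com/wzh134/Syn.

```python
import numpy as np

def syn_step(xi, lam=1.0, B=1.0, C=1.0, gamma=0.1, steps=5, reverse=False):
    """Discretized Haken-style order-parameter updates on attention weights.

    Forward steps sharpen the vector toward its largest entry
    ("concentration"); negating the step size walks the same dynamics
    backward and flattens the vector ("distraction"). lam, B, C, and
    gamma are illustrative hyperparameters, not the paper's values.
    """
    g = -gamma if reverse else gamma
    xi = np.asarray(xi, dtype=float)
    for _ in range(steps):
        total = np.sum(xi ** 2)
        # Haken dynamics: lam*xi_k - B*sum_{j != k}(xi_j^2)*xi_k - C*total*xi_k
        dxi = lam * xi - B * (total - xi ** 2) * xi - C * total * xi
        xi = np.clip(xi + g * dxi, 1e-8, None)  # keep the weights positive
    return xi / xi.sum()  # renormalize so the weights sum to one

if __name__ == "__main__":
    w = np.array([0.4, 0.3, 0.2, 0.1])
    print("concentrated:", syn_step(w))                # mass shifts toward w[0]
    print("distracted:  ", syn_step(w, reverse=True))  # distribution flattens
```

Because the relative growth rate of each weight increases with its magnitude, forward iteration is winner-take-all, while the reversed step undoes that sharpening; this is the sense in which such an update can serve as a forward/backward formula for concentrating or distracting attention weights.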