A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence
Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit complex data. However, no deep study exists in the literature on the behavior of activation function...
Main Authors: | Andrinandrasana David Rasamoelina, Ivan Cík, Peter Sincak, Marián Mach, Lukáš Hruška |
Format: | Article |
Language: | English |
Published: | Asociación Española para la Inteligencia Artificial, 2022-12-01 |
Series: | Inteligencia Artificial |
Subjects: | Activation Function, Computer Vision, Deep Learning |
Online Access: | https://journal.iberamia.org/index.php/intartif/article/view/845 |
_version_ | 1811205091117498368 |
author | Andrinandrasana David Rasamoelina; Ivan Cík; Peter Sincak; Marián Mach; Lukáš Hruška |
author_facet | Andrinandrasana David Rasamoelina; Ivan Cík; Peter Sincak; Marián Mach; Lukáš Hruška |
author_sort | Andrinandrasana David Rasamoelina |
collection | DOAJ |
description |
Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit complex data. However, no deep study exists in the literature on the behavior of activation functions in modern architectures. Therefore, in this research, we compare the 18 most used activation functions on multiple datasets (CIFAR-10, CIFAR-100, CALTECH-256) using 4 different models (EfficientNet, ResNet, a variation of ResNet using the bag of tricks, and MobileNet V3). Furthermore, we explore the shape of the loss landscape of those different architectures with various activation functions. Lastly, based on the results of our experiments, we introduce a new locally quadratic activation function, named Hytana, alongside a variation, Parametric Hytana, which outperform common activation functions and address the dying ReLU problem.
|
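The dying ReLU problem the abstract refers to is the standard failure mode in which a unit whose pre-activation stays negative outputs zero and receives zero gradient, so its incoming weights never update again. This record does not give Hytana's formula, so the sketch below only illustrates the problem itself with stock PyTorch activations; the framework and toy tensors are our own illustration, not the paper's setup:

```python
import torch
import torch.nn.functional as F

# Dying ReLU: when every pre-activation is negative, ReLU outputs 0
# and its gradient is 0, so upstream weights receive no update.
x = torch.linspace(-3.0, -0.1, steps=8, requires_grad=True)
F.relu(x).sum().backward()
print(x.grad)   # all zeros -> no learning signal; the unit is "dead"

# A leaky variant keeps a small negative slope, so gradient still flows;
# parametric fixes (e.g. PReLU, or the paper's Parametric Hytana) learn it.
x2 = torch.linspace(-3.0, -0.1, steps=8, requires_grad=True)
F.leaky_relu(x2, negative_slope=0.01).sum().backward()
print(x2.grad)  # constant 0.01 here: small but nonzero gradients
```

In a comparison study of the kind the abstract describes, a candidate activation would simply replace the nonlinearity inside each block of EfficientNet, ResNet, or MobileNet V3 before training on CIFAR-10, CIFAR-100, and CALTECH-256.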
first_indexed | 2024-04-12T03:24:47Z |
format | Article |
id | doaj.art-21c687e93f0c4ff0b10be6a35cf4e7fb |
institution | Directory Open Access Journal |
issn | 1137-3601 1988-3064 |
language | English |
last_indexed | 2024-04-12T03:24:47Z |
publishDate | 2022-12-01 |
publisher | Asociación Española para la Inteligencia Artificial |
record_format | Article |
series | Inteligencia Artificial |
spelling | doaj.art-21c687e93f0c4ff0b10be6a35cf4e7fb; 2022-12-22T03:49:44Z; eng; Asociación Española para la Inteligencia Artificial; Inteligencia Artificial; 1137-3601; 1988-3064; 2022-12-01; vol. 25, iss. 70; 10.4114/intartif.vol25iss70pp95-109; A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence; Andrinandrasana David Rasamoelina (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic); Ivan Cík (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic); Peter Sincak (Faculty of Mechanical Engineering and Informatics, University of Miskolc, Hungary); Marián Mach (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic); Lukáš Hruška (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic). Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit complex data. However, no deep study exists in the literature on the behavior of activation functions in modern architectures. Therefore, in this research, we compare the 18 most used activation functions on multiple datasets (CIFAR-10, CIFAR-100, CALTECH-256) using 4 different models (EfficientNet, ResNet, a variation of ResNet using the bag of tricks, and MobileNet V3). Furthermore, we explore the shape of the loss landscape of those different architectures with various activation functions. Lastly, based on the results of our experiments, we introduce a new locally quadratic activation function, named Hytana, alongside a variation, Parametric Hytana, which outperform common activation functions and address the dying ReLU problem. https://journal.iberamia.org/index.php/intartif/article/view/845; Activation Function; Computer Vision; Deep Learning |
spellingShingle | Andrinandrasana David Rasamoelina; Ivan Cík; Peter Sincak; Marián Mach; Lukáš Hruška; A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence; Inteligencia Artificial; Activation Function; Computer Vision; Deep Learning |
title | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_full | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_fullStr | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_full_unstemmed | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_short | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_sort | large scale study of activation functions in modern deep neural network architectures for efficient convergence |
topic | Activation Function; Computer Vision; Deep Learning |
url | https://journal.iberamia.org/index.php/intartif/article/view/845 |
work_keys_str_mv | AT andrinandrasanadavidrasamoelina alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT ivancik alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT petersincak alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT marianmach alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT lukashruska alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT andrinandrasanadavidrasamoelina largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT ivancik largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT petersincak largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT marianmach largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT lukashruska largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence |