A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence

Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit arbitrarily complex data. However, no in-depth study exists in the literature on the behavior of activation functions in modern architectures. Therefore, in this research, we compare the 18 most widely used activation functions on multiple datasets (CIFAR-10, CIFAR-100, CALTECH-256) using 4 different models (EfficientNet, ResNet, a variation of ResNet using the bag of tricks, and MobileNet V3). Furthermore, we explore the shape of the loss landscape of those architectures with the various activation functions. Lastly, based on the results of our experiments, we introduce a new locally quadratic activation function, Hytana, alongside one variation, Parametric Hytana, which outperform common activation functions and address the dying ReLU problem.
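
As a hedged illustration of the loss-landscape exploration the abstract mentions: the paper's exact procedure is described in the article itself, so the sketch below only shows a generic one-dimensional scan, evaluating the loss along a line L(theta + alpha * d) for a random direction d, with a toy model and synthetic data standing in for the paper's architectures and datasets.

# Hedged sketch, not the authors' exact procedure: one common way to probe
# the shape of the loss landscape is to evaluate the loss along a line in
# parameter space, L(theta + alpha * d), for a random direction d.
# The model, data, and direction here are toy placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
X, y = torch.randn(64, 10), torch.randint(0, 2, (64,))

theta = [p.detach().clone() for p in model.parameters()]  # anchor point
d = [torch.randn_like(p) for p in theta]                  # scan direction

with torch.no_grad():
    for alpha in torch.linspace(-1.0, 1.0, 11).tolist():
        # Move the model to theta + alpha * d and measure the loss there.
        for p, t, dd in zip(model.parameters(), theta, d):
            p.copy_(t + alpha * dd)
        print(f"alpha={alpha:+.1f}  loss={loss_fn(model(X), y).item():.4f}")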

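Similarly, a minimal sketch of the dying ReLU problem that Hytana and Parametric Hytana are said to address. The Hytana formula itself is defined in the paper and is not reproduced here; nn.PReLU stands in as a generic parametric activation with a learnable negative-side slope.

# Hedged sketch of the dying ReLU problem the paper's Hytana activations
# target. Hytana's formula is defined in the article and not reproduced
# here; nn.PReLU stands in as a generic parametric activation whose
# learnable negative-side slope keeps gradients nonzero for x < 0.
import torch
import torch.nn as nn

x = torch.linspace(-3.0, 3.0, 7, requires_grad=True)
torch.relu(x).sum().backward()
print("ReLU grad: ", x.grad)   # zero for every negative input: the unit "dies"

x2 = torch.linspace(-3.0, 3.0, 7, requires_grad=True)
nn.PReLU(init=0.25)(x2).sum().backward()
print("PReLU grad:", x2.grad)  # 0.25 on the negative side, so learning continues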

Bibliographic Details
Main Authors: Andrinandrasana David Rasamoelina, Ivan Cík, Peter Sincak, Marián Mach, Lukáš Hruška
Format: Article
Language: English
Published: Asociación Española para la Inteligencia Artificial, 2022-12-01
Series: Inteligencia Artificial
Subjects: Activation Function, Computer Vision, Deep Learning
Online Access: https://journal.iberamia.org/index.php/intartif/article/view/845
author Andrinandrasana David Rasamoelina
Ivan Cík
Peter Sincak
Marián Mach
Lukáš Hruška
collection DOAJ
description Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit arbitrarily complex data. However, no in-depth study exists in the literature on the behavior of activation functions in modern architectures. Therefore, in this research, we compare the 18 most widely used activation functions on multiple datasets (CIFAR-10, CIFAR-100, CALTECH-256) using 4 different models (EfficientNet, ResNet, a variation of ResNet using the bag of tricks, and MobileNet V3). Furthermore, we explore the shape of the loss landscape of those architectures with the various activation functions. Lastly, based on the results of our experiments, we introduce a new locally quadratic activation function, Hytana, alongside one variation, Parametric Hytana, which outperform common activation functions and address the dying ReLU problem.
format Article
id doaj.art-21c687e93f0c4ff0b10be6a35cf4e7fb
institution Directory Open Access Journal
issn 1137-3601
1988-3064
language English
publishDate 2022-12-01
publisher Asociación Española para la Inteligencia Artificial
series Inteligencia Artificial
spelling Asociación Española para la Inteligencia Artificial. Inteligencia Artificial, ISSN 1137-3601 / 1988-3064, vol. 25, no. 70, pp. 95-109, published 2022-12-01. DOI: 10.4114/intartif.vol25iss70pp95-109.
A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence
Andrinandrasana David Rasamoelina (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic)
Ivan Cík (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic)
Peter Sincak (Faculty of Mechanical Engineering and Informatics, University of Miskolc, Hungary)
Marián Mach (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic)
Lukáš Hruška (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic)
Online access: https://journal.iberamia.org/index.php/intartif/article/view/845
Topics: Activation Function, Computer Vision, Deep Learning
title A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence
topic Activation Function
Computer Vision
Deep Learning
url https://journal.iberamia.org/index.php/intartif/article/view/845