A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence
Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit complex data. However, no deep study exists in the literature on the behavior of activation function...
Main Authors: | Andrinandrasana David Rasamoelina, Ivan Cík, Peter Sincak, Marián Mach, Lukáš Hruška |
Format: | Article |
Language: | English |
Published: | Asociación Española para la Inteligencia Artificial, 2022-12-01 |
Series: | Inteligencia Artificial |
Subjects: | Activation Function, Computer Vision, Deep Learning |
Online Access: | https://journal.iberamia.org/index.php/intartif/article/view/845 |
_version_ | 1811205091117498368 |
author | Andrinandrasana David Rasamoelina; Ivan Cík; Peter Sincak; Marián Mach; Lukáš Hruška |
author_facet | Andrinandrasana David Rasamoelina; Ivan Cík; Peter Sincak; Marián Mach; Lukáš Hruška |
author_sort | Andrinandrasana David Rasamoelina |
collection | DOAJ |
description |
Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit complex data. However, no deep study exists in the literature on the behavior of activation functions in modern architectures. Therefore, in this research, we compare the 18 most used activation functions on multiple datasets (CIFAR-10, CIFAR-100, CALTECH-256) using 4 different models (EfficientNet, ResNet, a variation of ResNet using the bag of tricks, and MobileNet V3). Furthermore, we explore the shape of the loss landscape of those different architectures with various activation functions. Lastly, based on the results of our experiments, we introduce a new locally quadratic activation function, named Hytana, alongside a variation, Parametric Hytana, which outperform common activation functions and address the dying ReLU problem.
|
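The dying ReLU problem the abstract refers to is the standard failure mode in which a unit whose pre-activation stays negative outputs zero and receives zero gradient, so its incoming weights never update again. This record does not give Hytana's formula, so the sketch below only illustrates the problem itself with stock PyTorch activations; the framework and toy tensors are our own illustration, not the paper's setup:

```python
import torch
import torch.nn.functional as F

# Dying ReLU: when every pre-activation is negative, ReLU outputs 0
# and its gradient is 0, so upstream weights receive no update.
x = torch.linspace(-3.0, -0.1, steps=8, requires_grad=True)
F.relu(x).sum().backward()
print(x.grad)   # all zeros -> no learning signal; the unit is "dead"

# A leaky variant keeps a small negative slope, so gradient still flows;
# parametric fixes (e.g. PReLU, or the paper's Parametric Hytana) learn it.
x2 = torch.linspace(-3.0, -0.1, steps=8, requires_grad=True)
F.leaky_relu(x2, negative_slope=0.01).sum().backward()
print(x2.grad)  # constant 0.01 here: small but nonzero gradients
```

In a comparison study of the kind the abstract describes, a candidate activation would simply replace the nonlinearity inside each block of EfficientNet, ResNet, or MobileNet V3 before training on CIFAR-10, CIFAR-100, and CALTECH-256.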
first_indexed | 2024-04-12T03:24:47Z |
format | Article |
id | doaj.art-21c687e93f0c4ff0b10be6a35cf4e7fb |
institution | Directory Open Access Journal |
issn | 1137-3601 1988-3064 |
language | English |
last_indexed | 2024-04-12T03:24:47Z |
publishDate | 2022-12-01 |
publisher | Asociación Española para la Inteligencia Artificial |
record_format | Article |
series | Inteligencia Artificial |
spelling | doaj.art-21c687e93f0c4ff0b10be6a35cf4e7fb; 2022-12-22T03:49:44Z; eng; Asociación Española para la Inteligencia Artificial; Inteligencia Artificial; 1137-3601; 1988-3064; 2022-12-01; vol. 25, iss. 70; 10.4114/intartif.vol25iss70pp95-109; A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence; Andrinandrasana David Rasamoelina (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic); Ivan Cík (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic); Peter Sincak (Faculty of Mechanical Engineering and Informatics, University of Miskolc, Hungary); Marián Mach (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic); Lukáš Hruška (Dept. of Cybernetics and Artificial Intelligence, FEI TU of Kosice, Slovak Republic). Activation functions play an important role in the convergence of learning algorithms based on neural networks. They provide neural networks with nonlinearity and the ability to fit complex data. However, no deep study exists in the literature on the behavior of activation functions in modern architectures. Therefore, in this research, we compare the 18 most used activation functions on multiple datasets (CIFAR-10, CIFAR-100, CALTECH-256) using 4 different models (EfficientNet, ResNet, a variation of ResNet using the bag of tricks, and MobileNet V3). Furthermore, we explore the shape of the loss landscape of those different architectures with various activation functions. Lastly, based on the results of our experiments, we introduce a new locally quadratic activation function, named Hytana, alongside a variation, Parametric Hytana, which outperform common activation functions and address the dying ReLU problem. https://journal.iberamia.org/index.php/intartif/article/view/845; Activation Function; Computer Vision; Deep Learning |
spellingShingle | Andrinandrasana David Rasamoelina; Ivan Cík; Peter Sincak; Marián Mach; Lukáš Hruška; A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence; Inteligencia Artificial; Activation Function; Computer Vision; Deep Learning |
title | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_full | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_fullStr | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_full_unstemmed | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_short | A Large-Scale Study of Activation Functions in Modern Deep Neural Network Architectures for Efficient Convergence |
title_sort | large scale study of activation functions in modern deep neural network architectures for efficient convergence |
topic | Activation Function; Computer Vision; Deep Learning |
url | https://journal.iberamia.org/index.php/intartif/article/view/845 |
work_keys_str_mv | AT andrinandrasanadavidrasamoelina alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT ivancik alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT petersincak alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT marianmach alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT lukashruska alargescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT andrinandrasanadavidrasamoelina largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT ivancik largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT petersincak largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT marianmach largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence AT lukashruska largescalestudyofactivationfunctionsinmoderndeepneuralnetworkarchitecturesforefficientconvergence |