A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks

With the success of deep learning in a wide variety of areas, many deep multi-task learning (MTL) models have been proposed, claiming performance improvements obtained by sharing learned structure across several related tasks. However, the dynamics of multi-task learning in deep neural networks are still not well understood at either the theoretical or the experimental level. In particular, the usefulness of different task pairs is not known a priori. In practice, this means that properly combining the losses of different tasks becomes a critical issue in multi-task learning, as different methods may yield different results. In this paper, we benchmark different multi-task learning approaches that use a shared-trunk architecture with task-specific branches across three MTL datasets. On the first dataset, Multi-MNIST, a multi-task variant of MNIST (the Modified National Institute of Standards and Technology database), we thoroughly test several weighting strategies, including simply adding the task-specific cost functions together, dynamic weight average (DWA), and uncertainty weighting, each with varying amounts of training data per task. We find that multi-task learning typically does not improve performance for a user-defined combination of tasks. Further experiments on diverse tasks, network architectures, and datasets suggest that multi-task learning requires careful selection of both task pairs and weighting strategies to equal or exceed the performance of single-task learning.
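
Note on the weighting strategies named above: equal weighting simply sums the per-task losses; dynamic weight average (DWA, introduced by Liu et al.) rescales each task's weight according to how quickly its loss has recently been decreasing; uncertainty weighting (Kendall et al.) learns a per-task log-variance that balances the tasks automatically. The following sketch is not taken from the paper's code: the names, the simplified log-variance parameterization, and the DWA temperature of 2.0 are illustrative assumptions about how these strategies are commonly implemented in PyTorch.

import math
import torch
import torch.nn as nn

def equal_weight_loss(task_losses):
    # Baseline named in the abstract: simply add the task-specific losses together.
    return sum(task_losses)

def dwa_weights(losses_prev, losses_prev2, temperature=2.0):
    # Dynamic weight average: losses_prev and losses_prev2 are per-task average losses
    # from the last two epochs. Tasks whose loss has stopped falling (ratio near or
    # above 1) receive a larger weight; weights sum to the number of tasks.
    ratios = [l1 / l2 for l1, l2 in zip(losses_prev, losses_prev2)]
    exps = [math.exp(r / temperature) for r in ratios]
    k = len(ratios)
    return [k * e / sum(exps) for e in exps]

class UncertaintyWeighting(nn.Module):
    # Uncertainty weighting: one learnable log-variance s_i per task, with
    # combined loss = sum_i exp(-s_i) * L_i + s_i (a common simplification of
    # Kendall et al.'s objective), optimized jointly with the network weights.
    def __init__(self, n_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(n_tasks))

    def forward(self, task_losses):
        losses = torch.stack(task_losses)
        return torch.sum(torch.exp(-self.log_vars) * losses + self.log_vars)

# Usage sketch for two Multi-MNIST tasks (e.g. left- and right-digit classification):
# weighter = UncertaintyWeighting(n_tasks=2)
# total = weighter([loss_left, loss_right])  # each loss is a 0-dim tensor
# total.backward()                           # updates the trunk, branches, and log-variances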

Bibliographic Details
Main Authors: Ting Gong (ORCID: 0000-0001-7226-7749), Tyler Lee, Cory Stephenson, Venkata Renduchintala, Suchismita Padhy, Anthony Ndirango, Gokce Keskin, Oguz H. Elibol
Author Affiliation: Intel AI Lab, Santa Clara, CA, USA (all authors)
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access, Vol. 7, pp. 141627-141632
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2943604
Collection: Directory of Open Access Journals (DOAJ)
Subjects: Dynamic weighting average; multi-MNIST; multi-objective optimization; multi-task learning; uncertainty weighting
Online Access: https://ieeexplore.ieee.org/document/8848395/