A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks
With the success of deep learning in a wide variety of areas, many deep multi-task learning (MTL) models have been proposed claiming improvements in performance obtained by sharing the learned structure across several related tasks. However, the dynamics of multi-task learning in deep neural network...
Main Authors: | Ting Gong, Tyler Lee, Cory Stephenson, Venkata Renduchintala, Suchismita Padhy, Anthony Ndirango, Gokce Keskin, Oguz H. Elibol |
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Dynamic weighting average; multi-MNIST; multi-objective optimization; multi-task learning; uncertainty weighting |
Online Access: | https://ieeexplore.ieee.org/document/8848395/ |
_version_ | 1818878060027445248 |
author | Ting Gong; Tyler Lee; Cory Stephenson; Venkata Renduchintala; Suchismita Padhy; Anthony Ndirango; Gokce Keskin; Oguz H. Elibol |
author_facet | Ting Gong; Tyler Lee; Cory Stephenson; Venkata Renduchintala; Suchismita Padhy; Anthony Ndirango; Gokce Keskin; Oguz H. Elibol |
author_sort | Ting Gong |
collection | DOAJ |
description | With the success of deep learning in a wide variety of areas, many deep multi-task learning (MTL) models have been proposed, claiming performance improvements obtained by sharing learned structure across several related tasks. However, the dynamics of multi-task learning in deep neural networks are still not well understood at either the theoretical or the experimental level. In particular, the usefulness of different task pairs is not known a priori. Practically, this means that properly combining the losses of different tasks becomes a critical issue in multi-task learning, as different methods may yield different results. In this paper, we benchmark different multi-task learning approaches using a shared-trunk architecture with task-specific branches across three different MTL datasets. For the first dataset, Multi-MNIST (derived from the Modified National Institute of Standards and Technology database), we thoroughly test several weighting strategies, including simply adding the task-specific cost functions together, dynamic weight average (DWA), and uncertainty weighting, each with various amounts of training data per task. We find that multi-task learning typically does not improve performance for a user-defined combination of tasks. Further experiments on diverse tasks, network architectures, and datasets suggest that multi-task learning requires careful selection of both task pairs and weighting strategies to equal or exceed the performance of single-task learning. |
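The description names three ways of combining per-task losses into a single training objective. The following is a minimal, illustrative Python sketch of those strategies, not the authors' code; the function names, the temperature default, and the exact form of the uncertainty regularizer are assumptions made for the example.

```python
# Illustrative sketch only (not the authors' implementation): three loss-combination
# strategies named in the abstract. Names and defaults here are assumptions.
import math

def equal_sum(losses):
    """Baseline: simply add the task-specific losses together."""
    return sum(losses)

def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic weight average: weight each task by the ratio of its losses
    from the previous two epochs, passed through a temperature-scaled softmax."""
    ratios = [l1 / l2 for l1, l2 in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    k = len(ratios)
    return [k * e / sum(exps) for e in exps]  # weights sum to the number of tasks

def uncertainty_weighted(losses, log_vars):
    """Uncertainty weighting: scale each task loss by exp(-log_var) and add
    log_var as a regularizer; the log-variances would normally be learned."""
    return sum(math.exp(-s) * l + s for l, s in zip(losses, log_vars))

# Toy example with two tasks and hypothetical loss values.
losses = [0.8, 0.3]
print(equal_sum(losses))                          # plain sum of the two losses
weights = dwa_weights([0.8, 0.3], [1.0, 0.35])    # losses from epochs t-1 and t-2
print(sum(w * l for w, l in zip(weights, losses)))
print(uncertainty_weighted(losses, [0.0, 0.0]))   # zero log-variances reduce to the plain sum
```

In a real training loop these would operate on loss tensors, with the DWA weights recomputed each epoch and the log-variances optimized jointly with the network parameters.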
first_indexed | 2024-12-19T14:08:10Z |
format | Article |
id | doaj.art-d0b7dc367767469291058e22e4b47415 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T14:08:10Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-d0b7dc367767469291058e22e4b47415 (2022-12-21T20:18:14Z); eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 141627-141632; doi:10.1109/ACCESS.2019.2943604; article 8848395; A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks; Ting Gong (https://orcid.org/0000-0001-7226-7749), Tyler Lee, Cory Stephenson, Venkata Renduchintala, Suchismita Padhy, Anthony Ndirango, Gokce Keskin, Oguz H. Elibol, all Intel AI Lab, Santa Clara, CA, USA; abstract as in the description field above; https://ieeexplore.ieee.org/document/8848395/; Dynamic weighting average; multi-MNIST; multi-objective optimization; multi-task learning; uncertainty weighting |
spellingShingle | Ting Gong; Tyler Lee; Cory Stephenson; Venkata Renduchintala; Suchismita Padhy; Anthony Ndirango; Gokce Keskin; Oguz H. Elibol; A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks; IEEE Access; Dynamic weighting average; multi-MNIST; multi-objective optimization; multi-task learning; uncertainty weighting |
title | A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks |
title_full | A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks |
title_fullStr | A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks |
title_full_unstemmed | A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks |
title_short | A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks |
title_sort | comparison of loss weighting strategies for multi task learning in deep neural networks |
topic | Dynamic weighting average; multi-MNIST; multi-objective optimization; multi-task learning; uncertainty weighting |
url | https://ieeexplore.ieee.org/document/8848395/ |
work_keys_str_mv | AT tinggong acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT tylerlee acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT corystephenson acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT venkatarenduchintala acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT suchismitapadhy acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT anthonyndirango acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT gokcekeskin acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT oguzhelibol acomparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT tinggong comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT tylerlee comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT corystephenson comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT venkatarenduchintala comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT suchismitapadhy comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT anthonyndirango comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT gokcekeskin comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks AT oguzhelibol comparisonoflossweightingstrategiesformultitasklearningindeepneuralnetworks |