Mixture of experts based on confusion matrix and distribution

The number of parameters and the computational complexity of neural networks have grown in pursuit of better performance. Conditional computation has been proposed to increase model efficiency with minor losses in performance by activating parts of the network on a per-example basis. But there are still g...

Bibliographic Details
Main Author: Wang, Zhisheng
Other Authors: Mao Kezhi
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2020
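Subjects: Engineering::Computer science and engineering::Theory of computation::Analysis of algorithms and problem complexity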
Online Access: https://hdl.handle.net/10356/141136
description The number of parameters and the computational complexity of neural networks have grown in pursuit of better performance. Conditional computation has been proposed to increase model efficiency with minor losses in performance by activating parts of the network on a per-example basis. However, significant challenges remain in practice, in terms of both performance and algorithm design. In this dissertation, we review related work and propose a Mixture of Experts (MoE) method that addresses these challenges in a flexible manner. We introduce confusion matrix and distribution analysis: the confusion matrix is used to train each expert to handle a specific grouping, and distribution analysis is used to predict the confidence of the trained model's output for each example. Based on the distribution analysis result, a sparse combination of experts is activated for each example. We test the MoE method on classification tasks, where computational efficiency and accuracy are critical. We also evaluate the model on 5 datasets and examine the effect of the number of experts. The results show that the FLOPs of the network are reduced by at least 10% (Fashion MNIST with 10 experts) with minor losses in accuracy, or even improvements.
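
The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea, not the thesis's method. It assumes that a "grouping" is a set of classes the base model often confuses, that groups are formed by greedy merging on the confusion matrix, and that a top-k rule on softmax outputs stands in for the "distribution analysis". The names group_classes and route, and the threshold and k parameters, are hypothetical.

```python
import numpy as np

def group_classes(confusion, n_groups):
    """Greedily merge the classes most often confused with each other.

    Assumption: one expert will later be trained per returned group.
    """
    c = confusion.astype(float).copy()
    np.fill_diagonal(c, 0.0)   # ignore correct predictions
    sym = c + c.T              # count confusion in either direction
    groups = [{i} for i in range(len(sym))]
    while len(groups) > n_groups:
        best, pair = -1.0, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                score = sum(sym[i, j] for i in groups[a] for j in groups[b])
                if score > best:
                    best, pair = score, (a, b)
        a, b = pair
        groups[a] |= groups[b]
        del groups[b]
    return [sorted(g) for g in groups]

def route(probs, groups, threshold=0.9, k=2):
    """Pick the experts to activate for one example.

    Assumption: a confident base prediction needs no expert; otherwise the
    experts covering the k most probable classes are activated.
    """
    if probs.max() >= threshold:
        return []
    top = [int(c) for c in np.argsort(probs)[::-1][:k]]
    return sorted({e for e, g in enumerate(groups) for c in top if c in g})

# Toy 4-class confusion matrix: classes 0/1 and 2/3 are confused most often.
confusion = np.array([
    [50,  9,  1,  0],
    [ 8, 50,  0,  1],
    [ 0,  1, 50,  7],
    [ 1,  0,  6, 50],
])
groups = group_classes(confusion, n_groups=2)
confident = route(np.array([0.97, 0.01, 0.01, 0.01]), groups)
uncertain = route(np.array([0.10, 0.20, 0.40, 0.30]), groups)
split = route(np.array([0.40, 0.05, 0.45, 0.10]), groups)
```

In this toy setup a confident example activates no expert, an example torn between classes 2 and 3 activates only the expert for that group, and an example torn between classes 0 and 2 activates both experts. Any FLOPs saving would depend on how often the confident path is taken.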
spelling ntu-10356/141136 2023-07-04T16:42:05Z School of Electrical and Electronic Engineering EKZMao@ntu.edu.sg Master of Science (Computer Control and Automation) 2020-06-04T06:01:52Z en application/pdf
topic Engineering::Computer science and engineering::Theory of computation::Analysis of algorithms and problem complexity