Towards a unified framework of matrix derivatives

The need to process and analyze massive statistics simultaneously requires the derivatives of matrix-to-scalar functions (scalar-valued functions of matrices) or matrix-to-matrix functions (matrix-valued functions of matrices). Although derivatives of a matrix-to-scalar function have already be...


Bibliographic Details
Main Authors: Xu, Jianyu; Li, Guoqi; Wen, Changyun; Wu, Kun; Deng, Lei
Other Authors: School of Electrical and Electronic Engineering
Format: Journal Article
Language: English
Published: 2018
Subjects: Matrix Derivatives; Index Notation; DRNTU::Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/89062
http://hdl.handle.net/10220/46090
_version_ 1811691571910803456
author Xu, Jianyu
Li, Guoqi
Wen, Changyun
Wu, Kun
Deng, Lei
author2 School of Electrical and Electronic Engineering
collection NTU
description The need to process and analyze massive statistics simultaneously requires derivatives of matrix-to-scalar functions (scalar-valued functions of matrices) or matrix-to-matrix functions (matrix-valued functions of matrices). Although the derivative of a matrix-to-scalar function has already been defined, how to express it algebraically is not as clear as it is for scalar-to-scalar functions (scalar-valued functions of scalars). Because there is no uniform way of applying the "chain rule" to matrix derivatives, we classify the approaches used in existing schemes into two kinds: the first relies on index notation for the matrices involved, where the indices are eliminated when the matrices are multiplied; the second relies on vectorizing the matrices so that they can be handled like vector-to-vector functions (vector-valued functions of vectors), a case that is already settled. On the one hand, we find that the first approach generally has a much lower time complexity than the second. On the other hand, although most typical functions known so far can be differentiated with the first approach, the second approach is in theory more generally applicable to any use of the "chain rule." Nevertheless, the result of the second approach can also be simplified to the same order of time complexity as the first under certain conditions, so it is important to establish these conditions. In this paper, we establish a sufficient condition under which not only can the first approach be applied, but the time complexity of the results obtained from the second approach can also be reduced. This condition is described as two equivalent individual conditions, each of which corresponds to one of the two approaches. In addition, we generalize the methods and use the two approaches to compute derivatives under the two conditions individually. This paper enables us to unify the framework of matrix derivatives, which would result in various applications in science and engineering.
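As a rough illustration of the two approaches contrasted in the description, the sketch below differentiates L = sum(G * (A X B)) with respect to X in both ways. It is not taken from the paper: the example function, the fixed weighting matrix G, and the helper names `grad_index_notation` and `grad_vectorized` are all assumptions chosen for illustration.

```python
import numpy as np

def grad_index_notation(A, B, G):
    """dL/dX for L = sum(G * (A @ X @ B)), by contracting indices directly."""
    # dL/dX[k, l] = sum_{i, j} G[i, j] * A[i, k] * B[l, j], i.e. A.T @ G @ B.T
    return np.einsum("ij,ik,lj->kl", G, A, B)

def grad_vectorized(A, B, G):
    """Same gradient, via the Jacobian of vec(A @ X @ B) w.r.t. vec(X)."""
    n, p = A.shape[1], B.shape[0]
    # Column-major identity: vec(A X B) = kron(B.T, A) @ vec(X)
    J = np.kron(B.T, A)                   # shape (m*q, n*p)
    g = J.T @ G.reshape(-1, order="F")    # chain rule on the vectorized form
    return g.reshape((n, p), order="F")   # undo the column-major vec

# Hypothetical sizes: A is m x n, X is n x p, B is p x q, G is m x q
rng = np.random.default_rng(0)
m, n, p, q = 3, 4, 5, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, q))
G = rng.standard_normal((m, q))

grad_a = grad_index_notation(A, B, G)
grad_b = grad_vectorized(A, B, G)
print(np.allclose(grad_a, grad_b))
```

In this example the vectorized route materializes a Jacobian with m*q*n*p entries, while the index-notation route never forms it, which is one way to see why the two can differ in cost.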
first_indexed 2024-10-01T06:22:01Z
format Journal Article
id ntu-10356/89062
institution Nanyang Technological University
language English
last_indexed 2024-10-01T06:22:01Z
publishDate 2018
record_format dspace
spelling ntu-10356/89062 2020-03-07T13:57:31Z Published version 2018-09-25T08:50:50Z 2019-12-06T17:17:01Z 2018 Journal Article Xu, J., Li, G., Wen, C., Wu, K., & Deng, L. (2018). Towards a Unified Framework of Matrix Derivatives. IEEE Access, 6, 47922-47934. doi:10.1109/ACCESS.2018.2867234 en IEEE Access © 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. 13 p. application/pdf
title Towards a unified framework of matrix derivatives
topic Matrix Derivatives
DRNTU::Engineering::Electrical and electronic engineering
Index Notation