Logistic Biplot by Conjugate Gradient Algorithms and Iterated SVD

Multivariate binary data are increasingly frequent in practice. Although some adaptations of principal component analysis are used to reduce dimensionality for this kind of data, none of them provide a simultaneous representation of rows and columns (biplot). Recently, a technique named logistic bip...


Bibliographic Details
Main Authors:	Jose Giovany Babativa-Márquez, José Luis Vicente-Villardón
Format:	Article
Language:	English
Published:	MDPI AG 2021-08-01
Series:	Mathematics
Subjects:	binary data; logistic biplot; optimization methods; conjugate gradient algorithm; coordinate descent algorithm; MM algorithm
Online Access:	https://www.mdpi.com/2227-7390/9/16/2015

author	Jose Giovany Babativa-Márquez, José Luis Vicente-Villardón
collection	DOAJ
description	Multivariate binary data are increasingly frequent in practice. Although some adaptations of principal component analysis are used to reduce dimensionality for this kind of data, none of them provides a simultaneous representation of rows and columns (biplot). Recently, a technique named logistic biplot (LB) has been developed to represent the rows and columns of a binary data matrix simultaneously; however, the algorithm used to fit its parameters is too computationally demanding to be useful in the presence of sparsity or when the matrix is large. We propose fitting an LB model using nonlinear conjugate gradient (CG) or majorization–minimization (MM) algorithms, and we introduce a cross-validation procedure to select the hyperparameter that represents the number of dimensions in the model. A Monte Carlo study that considers scenarios with several sparsity levels and different dimensions of the binary data set shows that the cross-validation procedure successfully selects the model for all algorithms studied. A comparison of running times shows that the CG algorithm is more efficient in the presence of sparsity and when the matrix is not very large, while the MM algorithm performs better when the binary matrix is balanced or large. To complement the proposed methods and give practical support, an R package called BiplotML has been written. To complete the study, real binary data on gene expression methylation are used to illustrate the proposed methods.
first_indexed	2024-03-10T08:37:22Z
format	Article
id	doaj.art-b3463de636b54733b2f4c2503ab776d9
institution	Directory Open Access Journal
issn	2227-7390
language	English
last_indexed	2024-03-10T08:37:22Z
publishDate	2021-08-01
publisher	MDPI AG
record_format	Article
series	Mathematics
title	Logistic Biplot by Conjugate Gradient Algorithms and Iterated SVD
topic	binary data; logistic biplot; optimization methods; conjugate gradient algorithm; coordinate descent algorithm; MM algorithm
url	https://www.mdpi.com/2227-7390/9/16/2015


Similar Items