Correlations in Compositional Data without Log Transformations

This article proposes a method for determining the p-value of correlations in compositional data, i.e., those data that arise as a result of dividing original values by their sum. Data organized in this way are typical for many fields of knowledge, but there is still no consensus...

Bibliographic Details
Main Authors: Yury V. Monich, Yury D. Nechipurenko
Format: Article
Language: English
Published: MDPI AG 2023-11-01
Series: Axioms
Subjects: compositional data; mathematical expectation shift; loss of degrees of freedom; hybrid model
Online Access: https://www.mdpi.com/2075-1680/12/12/1084
_version_ 1827575682476015616
author Yury V. Monich
Yury D. Nechipurenko
author_facet Yury V. Monich
Yury D. Nechipurenko
author_sort Yury V. Monich
collection DOAJ
description This article proposes a method for determining the p-value of correlations in compositional data, i.e., data that arise from dividing original values by their sum. Data organized in this way are typical of many fields of knowledge, but there is still no consensus on methods for interpreting correlations in such data. In the second decade of the new millennium, almost all newly emerging methods for solving this problem have been based on the log transformation of data. The method proposed here uses no log transformations. We return to the early attempts to solve the problem and rely on the negative shifts of correlations in the multinomial distribution. In modeling the data, we use a hybrid method that combines the hypergeometric distribution with a distribution following any other law. While developing the calculation method, we found that the number of degrees of freedom in compositional data takes discrete values only when all normalizing sums are equal, and that it decreases when the sums are not equal, becoming a continuously varying quantity. Estimating the number of degrees of freedom and the strength of its influence on the magnitude of the shift in the distribution of correlation coefficients is the basis of the proposed method.
first_indexed 2024-03-08T21:00:29Z
format Article
id doaj.art-513fa066edb74574a4a0df85901b6573
institution Directory Open Access Journal
issn 2075-1680
language English
last_indexed 2024-03-08T21:00:29Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Axioms
spelling doaj.art-513fa066edb74574a4a0df85901b6573
2023-12-22T13:53:11Z
eng
MDPI AG
Axioms
2075-1680
2023-11-01
12 12 1084
10.3390/axioms12121084
Correlations in Compositional Data without Log Transformations
Yury V. Monich 0
Yury D. Nechipurenko 1
Institute of Linguistics, Russian Academy of Sciences, Bolshoi Kislovsky Lane, 1 bld, Moscow 125009, Russia
Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov St., 32, Moscow 119991, Russia
This article proposes a method for determining the p-value of correlations in compositional data, i.e., those data that arise as a result of dividing original values by their sum. Data organized in this way are typical for many fields of knowledge, but there is still no consensus on methods for interpreting correlations in such data. In the second decade of the new millennium, almost all newly emerging methods for solving this problem have become based on the log transformation of data. In the method proposed here, there are no log transformations. We return to the early stages of attempting to solve the problem and rely on negative shifts in correlations in the multinomial distribution. In modeling the data, we use a hybrid method that combines the hypergeometric distribution with the distribution of any other law. During our work on the calculation method, we found that the number of degrees of freedom in compositional data measures discretely only when all normalizing sums are equal and that it decreases when the sums are not equal, becoming a continuously varying quantity. Estimation of the number of degrees of freedom and the strength of its influence on the magnitude of the shift in the distribution of correlation coefficients is the basis of the proposed method.
https://www.mdpi.com/2075-1680/12/12/1084
compositional data
mathematical expectation shift
loss of degrees of freedom
hybrid model
spellingShingle Yury V. Monich
Yury D. Nechipurenko
Correlations in Compositional Data without Log Transformations
Axioms
compositional data
mathematical expectation shift
loss of degrees of freedom
hybrid model
title Correlations in Compositional Data without Log Transformations
title_full Correlations in Compositional Data without Log Transformations
title_fullStr Correlations in Compositional Data without Log Transformations
title_full_unstemmed Correlations in Compositional Data without Log Transformations
title_short Correlations in Compositional Data without Log Transformations
title_sort correlations in compositional data without log transformations
topic compositional data
mathematical expectation shift
loss of degrees of freedom
hybrid model
url https://www.mdpi.com/2075-1680/12/12/1084
work_keys_str_mv AT yuryvmonich correlationsincompositionaldatawithoutlogtransformations
AT yurydnechipurenko correlationsincompositionaldatawithoutlogtransformations
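
The description above refers to the negative shift of correlations that appears once values are divided by their sum. The sketch below is only an illustration of that general effect and is not the authors' hybrid hypergeometric method. It assumes independent gamma-distributed raw columns (an arbitrary choice made here), and the helper names closure and mean_offdiag_corr are hypothetical. For D equally weighted parts, the multinomial reference value for the pairwise correlation is -1/(D-1).

```python
import numpy as np


def closure(x):
    """Divide each row by its sum so that every row adds up to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)


def mean_offdiag_corr(x):
    """Average Pearson correlation over all distinct pairs of columns."""
    r = np.corrcoef(x, rowvar=False)
    d = r.shape[0]
    return (r.sum() - np.trace(r)) / (d * (d - 1))


rng = np.random.default_rng(0)
n_rows, n_parts = 5000, 5

# Assumption: independent positive columns, so the raw data carry
# no built-in correlation before they are normalized.
raw = rng.gamma(shape=20.0, scale=1.0, size=(n_rows, n_parts))
comp = closure(raw)

raw_mean_r = mean_offdiag_corr(raw)
comp_mean_r = mean_offdiag_corr(comp)

# Reference value for equally weighted parts: -1 / (D - 1).
reference_r = -1.0 / (n_parts - 1)

print("mean correlation, raw data:      ", round(raw_mean_r, 3))
print("mean correlation, after closure: ", round(comp_mean_r, 3))
print("reference value -1/(D-1):        ", reference_r)
```

With independent raw columns the average correlation stays near zero, while the normalized columns are pushed toward the negative reference value. This is the kind of shift that a significance test for compositional correlations has to account for.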