Correlations in Compositional Data without Log Transformations
This article proposes a method for determining the p-value of correlations in compositional data, i.e., those data that arise as a result of dividing original values by their sum. Data organized in this way are typical for many fields of knowledge, but there is still no consensus...
Main Authors: | Yury V. Monich, Yury D. Nechipurenko |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Axioms |
Subjects: | compositional data; mathematical expectation shift; loss of degrees of freedom; hybrid model |
Online Access: | https://www.mdpi.com/2075-1680/12/12/1084 |
_version_ | 1827575682476015616 |
---|---|
author | Yury V. Monich; Yury D. Nechipurenko |
author_facet | Yury V. Monich; Yury D. Nechipurenko |
author_sort | Yury V. Monich |
collection | DOAJ |
description | This article proposes a method for determining the p-value of correlations in compositional data, i.e., data that arise from dividing original values by their sum. Data organized in this way are typical of many fields of knowledge, but there is still no consensus on methods for interpreting correlations in such data. In the second decade of the new millennium, almost all newly emerging methods for solving this problem have been based on the log transformation of data. The method proposed here uses no log transformations. We return to the early attempts to solve the problem and rely on negative shifts in correlations in the multinomial distribution. In modeling the data, we use a hybrid method that combines the hypergeometric distribution with a distribution following any other law. During our work on the calculation method, we found that the number of degrees of freedom in compositional data takes discrete values only when all normalizing sums are equal, and that it decreases when the sums are unequal, becoming a continuously varying quantity. Estimating the number of degrees of freedom and the strength of its influence on the magnitude of the shift in the distribution of correlation coefficients is the basis of the proposed method. |
first_indexed | 2024-03-08T21:00:29Z |
format | Article |
id | doaj.art-513fa066edb74574a4a0df85901b6573 |
institution | Directory Open Access Journal |
issn | 2075-1680 |
language | English |
last_indexed | 2024-03-08T21:00:29Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Axioms |
spelling | doaj.art-513fa066edb74574a4a0df85901b6573; 2023-12-22T13:53:11Z; eng; MDPI AG; Axioms; 2075-1680; 2023-11-01; vol. 12, no. 12, art. 1084; 10.3390/axioms12121084; Correlations in Compositional Data without Log Transformations; Yury V. Monich (0), Yury D. Nechipurenko (1); (0) Institute of Linguistics, Russian Academy of Sciences, Bolshoi Kislovsky Lane, 1 bld, Moscow 125009, Russia; (1) Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov St., 32, Moscow 119991, Russia; https://www.mdpi.com/2075-1680/12/12/1084; compositional data; mathematical expectation shift; loss of degrees of freedom; hybrid model |
spellingShingle | Yury V. Monich; Yury D. Nechipurenko; Correlations in Compositional Data without Log Transformations; Axioms; compositional data; mathematical expectation shift; loss of degrees of freedom; hybrid model |
title | Correlations in Compositional Data without Log Transformations |
title_full | Correlations in Compositional Data without Log Transformations |
title_fullStr | Correlations in Compositional Data without Log Transformations |
title_full_unstemmed | Correlations in Compositional Data without Log Transformations |
title_short | Correlations in Compositional Data without Log Transformations |
title_sort | correlations in compositional data without log transformations |
topic | compositional data; mathematical expectation shift; loss of degrees of freedom; hybrid model |
url | https://www.mdpi.com/2075-1680/12/12/1084 |
work_keys_str_mv | AT yuryvmonich correlationsincompositionaldatawithoutlogtransformations AT yurydnechipurenko correlationsincompositionaldatawithoutlogtransformations |
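
The description above refers to negative shifts in correlations in the multinomial distribution. The following is a minimal sketch of that general effect only, not the authors' method: it assumes k equally likely parts, for which the standard multinomial result gives a correlation of -1/(k - 1) between any two parts, and checks this by simulation. All function names and parameter values here are hypothetical and chosen for illustration.

```python
import random


def multinomial_counts(n_trials, k, rng):
    """Distribute n_trials items over k equally likely parts (one multinomial draw)."""
    counts = [0] * k
    for _ in range(n_trials):
        counts[rng.randrange(k)] += 1
    return counts


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulated_correlation(k=4, n_trials=50, n_samples=4000, seed=0):
    """Correlation between the shares of parts 0 and 1 across many compositions."""
    rng = random.Random(seed)
    rows = [multinomial_counts(n_trials, k, rng) for _ in range(n_samples)]
    # Dividing each count by the row sum turns the counts into compositional shares.
    shares_0 = [row[0] / n_trials for row in rows]
    shares_1 = [row[1] / n_trials for row in rows]
    return pearson(shares_0, shares_1)


def theoretical_correlation(k):
    """Multinomial correlation between two of k equally likely parts."""
    return -1.0 / (k - 1)


if __name__ == "__main__":
    print("simulated:  ", round(simulated_correlation(), 3))
    print("theoretical:", round(theoretical_correlation(4), 3))
```

Although the parts are generated with no dependence beyond the fixed total, the simulated correlation is clearly negative. A baseline shift of this kind is why a correlation in compositional data cannot be tested against zero in the usual way.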