Partition Quantitative Assessment (PQA): A Quantitative Methodology to Assess the Embedded Noise in Clustered Omics and Systems Biology Data

Identifying groups that share common features among datasets through clustering analysis is a typical problem in many fields of science, particularly in post-omics and systems biology research. In this context, quantifying how well a measure can cluster or organize intrinsic groups is important since...


Bibliographic Details
Main Authors: Diego A. Camacho-Hernández, Victor E. Nieto-Caballero, José E. León-Burguete, Julio A. Freyre-González
Format: Article
Language: English
Published: MDPI AG 2021-06-01
Series: Applied Sciences
Subjects: omics data; data clustering; noise quantification
Online Access: https://www.mdpi.com/2076-3417/11/13/5999
collection DOAJ
description Identifying groups that share common features among datasets through clustering analysis is a typical problem in many fields of science, particularly in post-omics and systems biology research. In this context, quantifying how well a measure can cluster or organize intrinsic groups is important, since there is currently no statistical evaluation of how ordered the resulting clustered vector is or how much noise is embedded in it. Much of the literature focuses on how well the clustering algorithm orders the data, with several measures for external and internal statistical validation, but no score has been developed to statistically quantify the noise in an arranged vector after a clustering algorithm has been applied, i.e., how much of the clustering is due to randomness. Here, we present a quantitative methodology, based on autocorrelation, to address this problem.
first_indexed 2024-03-09T04:48:00Z
id doaj.art-7c17c30370d2456e8d2233136ede33b4
institution Directory Open Access Journal
issn 2076-3417
last_indexed 2024-03-09T04:48:00Z
spelling doaj.art-7c17c30370d2456e8d2233136ede33b4; 2023-12-03T13:13:57Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2021-06-01; vol. 11, no. 13, article 5999; DOI 10.3390/app11135999
affiliation (all four authors) Regulatory Systems Biology Research Group, Center for Genomic Sciences, Laboratory of Systems and Synthetic Biology, Universidad Nacional Autónoma de México (UNAM), Morelos 62210, Mexico
topic omics data
data clustering
noise quantification