Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images
Abstract. Background: Unsupervised clustering and outlier detection are important in medical research to understand the distributional composition of a collective of patients. A number of clustering methods exist, also for high-dimensional data after dimension reduction. Clustering and outlier detecti...
Main Authors: | Michael Selle, Magdalena Kircher, Cornelia Schwennen, Christian Visscher, Klaus Jung |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2024-02-01 |
Series: | BMC Medical Informatics and Decision Making |
Subjects: | CT scans; Outlier detection; Dimension reduction; Multiple co-inertia analysis; Bagplots |
Online Access: | https://doi.org/10.1186/s12911-024-02457-8 |
_version_ | 1827327261812981760 |
---|---|
author | Michael Selle; Magdalena Kircher; Cornelia Schwennen; Christian Visscher; Klaus Jung |
author_facet | Michael Selle; Magdalena Kircher; Cornelia Schwennen; Christian Visscher; Klaus Jung |
author_sort | Michael Selle |
collection | DOAJ |
description | Abstract. Background: Unsupervised clustering and outlier detection are important in medical research for understanding the distributional composition of a collective of patients. A number of clustering methods exist, including methods for high-dimensional data after dimension reduction. Clustering and outlier detection may, however, become less robust or contradictory if multiple high-dimensional data sets exist per patient. Such a scenario arises when the focus is on 3-D data of multiple organs per patient and a high-dimensional feature matrix is extracted per organ. Methods: We use principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and multiple co-inertia analysis (MCIA) combined with bagplots to study the distribution of multi-organ 3-D data taken by computed tomography scans. After point-set registration of multiple organs from two public data sets, several hundred shape features are extracted per organ. While PCA and t-SNE can only be applied to each organ individually, MCIA can project the data of all organs into the same low-dimensional space. Results: MCIA is the only approach here with which data of all organs can be projected into the same low-dimensional space. We studied how frequently (i.e., by how many organs) a patient was classified as belonging to the inner or outer 50% of the population, or as an outlier. Outliers could only be detected with MCIA and PCA. MCIA and t-SNE were more robust than PCA in judging the distributional location of a patient. Conclusions: MCIA is more appropriate and robust in judging the distributional location of a patient in the case of multiple high-dimensional data sets per patient. It is still advisable to apply PCA or t-SNE in parallel to MCIA to study the location of individual organs. |
first_indexed | 2024-03-07T14:57:37Z |
format | Article |
id | doaj.art-94744ab41ff743dea2b791fe63a3c123 |
institution | Directory Open Access Journal |
issn | 1472-6947 |
language | English |
last_indexed | 2024-03-07T14:57:37Z |
publishDate | 2024-02-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Informatics and Decision Making |
spelling | doaj.art-94744ab41ff743dea2b791fe63a3c123; 2024-03-05T19:19:46Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2024-02-01; 241113; 10.1186/s12911-024-02457-8; Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images; Michael Selle (0), Magdalena Kircher (1), Cornelia Schwennen (2), Christian Visscher (3), Klaus Jung (4); (0) Institute of Animal Genomics, University of Veterinary Medicine Hannover; (1) Institute of Animal Genomics, University of Veterinary Medicine Hannover; (2) Institute for Animal Nutrition, University of Veterinary Medicine Hannover; (3) Institute for Animal Nutrition, University of Veterinary Medicine Hannover; (4) Institute of Animal Genomics, University of Veterinary Medicine Hannover; Abstract Background Unsupervised clustering and outlier detection are important in medical research to understand the distributional composition of a collective of patients. A number of clustering methods exist, also for high-dimensional data after dimension reduction. Clustering and outlier detection may, however, become less robust or contradictory if multiple high-dimensional data sets per patient exist. Such a scenario is given when the focus is on 3-D data of multiple organs per patient, and a high-dimensional feature matrix per organ is extracted. Methods We use principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and multiple co-inertia analysis (MCIA) combined with bagplots to study the distribution of multi-organ 3-D data taken by computed tomography scans. After point-set registration of multiple organs from two public data sets, multiple hundred shape features are extracted per organ. While PCA and t-SNE can only be applied to each organ individually, MCIA can project the data of all organs into the same low-dimensional space. Results MCIA is the only approach, here, with which data of all organs can be projected into the same low-dimensional space. We studied how frequently (i.e., by how many organs) a patient was classified to belong to the inner or outer 50% of the population, or as an outlier. Outliers could only be detected with MCIA and PCA. MCIA and t-SNE were more robust in judging the distributional location of a patient in contrast to PCA. Conclusions MCIA is more appropriate and robust in judging the distributional location of a patient in the case of multiple high-dimensional data sets per patient. It is still recommendable to apply PCA or t-SNE in parallel to MCIA to study the location of individual organs.; https://doi.org/10.1186/s12911-024-02457-8; CT scans; Outlier detection; Dimension reduction; Multiple co-inertia analysis; Bagplots |
spellingShingle | Michael Selle; Magdalena Kircher; Cornelia Schwennen; Christian Visscher; Klaus Jung; Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images; BMC Medical Informatics and Decision Making; CT scans; Outlier detection; Dimension reduction; Multiple co-inertia analysis; Bagplots |
title | Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images |
title_sort | dimension reduction and outlier detection of 3 d shapes derived from multi organ ct images |
topic | CT scans; Outlier detection; Dimension reduction; Multiple co-inertia analysis; Bagplots |
url | https://doi.org/10.1186/s12911-024-02457-8 |
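
The abstract describes projecting each organ's feature matrix to a low-dimensional space and then counting, across organs, how often a patient falls in the inner 50%, the outer 50%, or among the outliers. The following is a minimal sketch of that counting idea for the per-organ PCA case only, not the authors' implementation. Several things here are assumptions made for illustration: the data are synthetic, the organ names are hypothetical, and the bagplot is replaced by a simpler stand-in (Mahalanobis distance in the 2-D score plane, with the median distance as the 50% boundary and three times the median as an outlier fence). A real bagplot is based on halfspace depth, and MCIA is not shown.

```python
import numpy as np

INNER, OUTER, OUTLIER = 0, 1, 2  # integer codes for the three locations


def pca_scores(X, n_components=2):
    """Project the rows of X onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def classify_location(scores, fence=3.0):
    """Label each point as INNER, OUTER or OUTLIER.

    Simplified stand-in for a bagplot (assumption): uses Mahalanobis
    distance instead of halfspace depth.
    """
    centered = scores - scores.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(centered, rowvar=False))
    d = np.sqrt(np.einsum("ij,jk,ik->i", centered, cov_inv, centered))
    median_d = np.median(d)
    labels = np.where(d <= median_d, INNER, OUTER)
    labels[d > fence * median_d] = OUTLIER
    return labels


# Synthetic example data: one feature matrix per (hypothetical) organ.
rng = np.random.default_rng(0)
n_patients, n_features = 40, 300
organs = ["liver", "kidney", "spleen"]
features = {organ: rng.normal(size=(n_patients, n_features)) for organ in organs}

# One row of labels per organ, one column per patient.
labels = np.stack([classify_location(pca_scores(features[o])) for o in organs])

# counts[p] = number of organs placing patient p in (inner, outer, outlier).
counts = np.stack([(labels == c).sum(axis=0) for c in (INNER, OUTER, OUTLIER)], axis=1)
print(counts[:5])
```

Each row of `counts` sums to the number of organs, so a patient whose organs disagree (for example, inner for one organ and outer for the others) is visible at a glance. This is the kind of per-organ disagreement that a joint projection such as MCIA is meant to avoid.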