Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images


Bibliographic Details
Main Authors: Michael Selle, Magdalena Kircher, Cornelia Schwennen, Christian Visscher, Klaus Jung
Format: Article
Language: English
Published: BMC 2024-02-01
Series: BMC Medical Informatics and Decision Making
Subjects: CT scans; Outlier detection; Dimension reduction; Multiple co-inertia analysis; Bagplots
Online Access: https://doi.org/10.1186/s12911-024-02457-8
collection DOAJ
description Abstract
Background: Unsupervised clustering and outlier detection are important in medical research for understanding the distributional composition of a patient cohort. A number of clustering methods exist, including methods for high-dimensional data after dimension reduction. Clustering and outlier detection may, however, become less robust or yield contradictory results when multiple high-dimensional data sets exist per patient. This is the case when the focus is on 3-D data of multiple organs per patient and a high-dimensional feature matrix is extracted per organ.
Methods: We use principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and multiple co-inertia analysis (MCIA), combined with bagplots, to study the distribution of multi-organ 3-D data acquired by computed tomography scans. After point-set registration of multiple organs from two public data sets, several hundred shape features are extracted per organ. While PCA and t-SNE can only be applied to each organ individually, MCIA can project the data of all organs into the same low-dimensional space.
Results: MCIA is the only approach considered here with which the data of all organs can be projected into the same low-dimensional space. We studied how frequently (i.e., for how many organs) a patient was classified as belonging to the inner or outer 50% of the population, or as an outlier. Outliers could only be detected with MCIA and PCA. MCIA and t-SNE were more robust than PCA in judging the distributional location of a patient.
Conclusions: MCIA is more appropriate and robust in judging the distributional location of a patient when multiple high-dimensional data sets exist per patient. It is still advisable to apply PCA or t-SNE in parallel with MCIA to study the location of individual organs.
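The per-organ part of the workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one patients-by-features matrix per organ, uses a plain SVD-based PCA, and replaces the bagplot (which relies on halfspace depth) with a simpler stand-in: the Mahalanobis distance from the coordinate-wise median, with the closest half of the patients labelled "inner" and patients beyond three times the median distance labelled "outlier". All function names, the label codes and the factor of 3 are hypothetical choices for illustration; MCIA is not covered.

```python
import numpy as np

# Hypothetical label codes for the distributional location of a patient.
INNER, OUTER, OUTLIER = 0, 1, 2


def pca_scores(features, n_components=2):
    """Project a (patients x features) matrix onto its first principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def classify_location(scores, fence_factor=3.0):
    """Label each patient as INNER, OUTER or OUTLIER in a 2-D score plot.

    Stand-in for a bagplot: distances are Mahalanobis distances from the
    coordinate-wise median, not halfspace depths.
    """
    diff = scores - np.median(scores, axis=0)
    cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
    dist = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
    median_dist = np.median(dist)
    labels = np.full(len(dist), OUTER)
    labels[dist <= median_dist] = INNER                 # closest half of the patients
    labels[dist > fence_factor * median_dist] = OUTLIER  # beyond the inflated fence
    return labels


def count_labels(features_by_organ):
    """Count, per patient, in how many organs each label was assigned."""
    stacked = np.stack(
        [classify_location(pca_scores(x)) for x in features_by_organ.values()]
    )  # shape: (n_organs, n_patients)
    return {code: (stacked == code).sum(axis=0) for code in (INNER, OUTER, OUTLIER)}
```

Calling `count_labels` with a dictionary that maps organ names to feature matrices returns three integer arrays, one per label, each of length equal to the number of patients. Because every organ is embedded separately, a patient can be "inner" for one organ and an outlier for another, which is the inconsistency a joint projection such as MCIA is meant to avoid.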
first_indexed 2024-03-07T14:57:37Z
format Article
id doaj.art-94744ab41ff743dea2b791fe63a3c123
institution Directory Open Access Journal
issn 1472-6947
spelling doaj.art-94744ab41ff743dea2b791fe63a3c123; 2024-03-05T19:19:46Z; eng; BMC; BMC Medical Informatics and Decision Making; ISSN 1472-6947; 2024-02-01; vol. 24, no. 1, pp. 1-13; doi 10.1186/s12911-024-02457-8
Michael Selle (Institute of Animal Genomics, University of Veterinary Medicine Hannover)
Magdalena Kircher (Institute of Animal Genomics, University of Veterinary Medicine Hannover)
Cornelia Schwennen (Institute for Animal Nutrition, University of Veterinary Medicine Hannover)
Christian Visscher (Institute for Animal Nutrition, University of Veterinary Medicine Hannover)
Klaus Jung (Institute of Animal Genomics, University of Veterinary Medicine Hannover)
topic CT scans
Outlier detection
Dimension reduction
Multiple co-inertia analysis
Bagplots
url https://doi.org/10.1186/s12911-024-02457-8