Visualizing Population Structures by Multidimensional Scaling of Smoothed PCA-Transformed Data
Population structure can be revealed using Single Nucleotide Polymorphisms (SNPs) which are genetic variations found in the DNA sequences of individuals. Due to the large number of SNPs, visualization of SNP data is often achieved through dimensionality reduction. Although Principal Component Analys...
Main Author: | Dimitrios Charalampidis |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
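Subjects: | Dimensionality reduction; multidimensional scaling; PCA; population structure; single nucleotide polymorphisms |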
Online Access: | https://ieeexplore.ieee.org/document/10041144/ |
_version_ | 1797905458506235904 |
---|---|
author | Dimitrios Charalampidis |
author_facet | Dimitrios Charalampidis |
author_sort | Dimitrios Charalampidis |
collection | DOAJ |
description | Population structure can be revealed using Single Nucleotide Polymorphisms (SNPs) which are genetic variations found in the DNA sequences of individuals. Due to the large number of SNPs, visualization of SNP data is often achieved through dimensionality reduction. Although Principal Component Analysis (PCA) has been extensively used for SNP data visualization, some other dimensionality reduction methods have been shown to be more successful in revealing complex population structures. Nevertheless, these techniques often suffer from reduced ability to preserve the global structure in the SNP data, namely the relative genetic distance between subpopulations, or from high computational cost. In this work, a method which uses Multidimensional Scaling (MDS) of smoothed PCA-transformed data (MSSPD) is proposed. MSSPD successfully reveals population structures in 2D maps, while being more effective than other techniques in preserving the global structure. In terms of computational efficiency, MSSPD is comparable to the fastest SNP visualization methods. |
first_indexed | 2024-04-10T10:05:38Z |
format | Article |
id | doaj.art-5fb0156eac00400798d96f7a1dc06fbf |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-10T10:05:38Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-5fb0156eac00400798d96f7a1dc06fbf; 2023-02-16T00:00:39Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; 11; 13594; 13604; 10.1109/ACCESS.2023.3243573; 10041144; Visualizing Population Structures by Multidimensional Scaling of Smoothed PCA-Transformed Data; Dimitrios Charalampidis; 0; https://orcid.org/0000-0002-4311-5428; Electrical and Computer Engineering Department, The University of New Orleans, New Orleans, LA, USA; Population structure can be revealed using Single Nucleotide Polymorphisms (SNPs) which are genetic variations found in the DNA sequences of individuals. Due to the large number of SNPs, visualization of SNP data is often achieved through dimensionality reduction. Although Principal Component Analysis (PCA) has been extensively used for SNP data visualization, some other dimensionality reduction methods have been shown to be more successful in revealing complex population structures. Nevertheless, these techniques often suffer from reduced ability to preserve the global structure in the SNP data, namely the relative genetic distance between subpopulations, or from high computational cost. In this work, a method which uses Multidimensional Scaling (MDS) of smoothed PCA-transformed data (MSSPD) is proposed. MSSPD successfully reveals population structures in 2D maps, while being more effective than other techniques in preserving the global structure. In terms of computational efficiency, MSSPD is comparable to the fastest SNP visualization methods.; https://ieeexplore.ieee.org/document/10041144/; Dimensionality reduction; multidimensional scaling; PCA; population structure; single nucleotide polymorphisms |
spellingShingle | Dimitrios Charalampidis; Visualizing Population Structures by Multidimensional Scaling of Smoothed PCA-Transformed Data; IEEE Access; Dimensionality reduction; multidimensional scaling; PCA; population structure; single nucleotide polymorphisms |
title | Visualizing Population Structures by Multidimensional Scaling of Smoothed PCA-Transformed Data |
title_sort | visualizing population structures by multidimensional scaling of smoothed pca transformed data |
topic | Dimensionality reduction; multidimensional scaling; PCA; population structure; single nucleotide polymorphisms |
url | https://ieeexplore.ieee.org/document/10041144/ |
work_keys_str_mv | AT dimitrioscharalampidis visualizingpopulationstructuresbymultidimensionalscalingofsmoothedpcatransformeddata |