A Comparison of PCA-LDA and PLS-DA Techniques for Classification of Vibrational Spectra

Bibliographic Details
Main Authors: Maria Lasalvia, Vito Capozzi, Giuseppe Perna
Format: Article
Language: English
Published: MDPI AG 2022-05-01
Series: Applied Sciences
Subjects: Raman, FTIR, PCA-LDA, PLS-DA
Online Access: https://www.mdpi.com/2076-3417/12/11/5345
collection DOAJ
description Vibrational spectroscopies provide information about the biochemical and structural environment of molecular functional groups inside samples. Over the past few decades, Raman and infrared-absorption-based techniques have been extensively used to investigate biological materials under different pathological conditions. Interesting results have been obtained, so these techniques have been proposed for use in a clinical setting for diagnostic purposes, as complementary tools to conventional cytological and histological techniques. In most cases, the differences between vibrational spectra measured for healthy and diseased samples are small, although these small differences could contain information useful for diagnosis. Therefore, interpreting the results requires analysis techniques able to highlight the minimal spectral variations that distinguish a dataset of measurements acquired on healthy samples from a dataset of measurements on samples in which a pathology occurs. Multivariate analysis techniques, which can handle large datasets and explore spectral information simultaneously, are suitable for this purpose. In the present study, two multivariate statistical techniques, principal component analysis-linear discriminant analysis (PCA-LDA) and partial least squares-discriminant analysis (PLS-DA), were used to analyse three different datasets of vibrational spectra, each including spectra of two different classes: (i) a simulated dataset comprising control-like and exposed-like spectra, (ii) a dataset of Raman spectra measured for control and proton beam-exposed MCF10A breast cells and (iii) a dataset of FTIR spectra measured for malignant non-metastatic MCF7 and metastatic MDA-MB-231 breast cancer cells. Both PCA-LDA and PLS-DA were first used to build a discrimination model from calibration sets of spectra extracted from the three datasets. Then, the classification performance was established by using test sets of unknown spectra. The results show that the classification models were able to distinguish the different spectra types with accuracy between 93% and 100%, sensitivity between 86% and 100% and specificity between 90% and 100%. The present study confirms that vibrational spectroscopy combined with multivariate analysis techniques has considerable potential for establishing reliable diagnostic models.
first_indexed 2024-03-10T01:31:43Z
id doaj.art-822128538359437fbe4e12b21d479ea3
institution Directory of Open Access Journals
issn 2076-3417
last_indexed 2024-03-10T01:31:43Z
doi 10.3390/app12115345
volume 12
issue 11
article_number 5345
affiliation Dipartimento di Medicina Clinica e Sperimentale, Università di Foggia, 71122 Foggia, Italy (Maria Lasalvia, Vito Capozzi, Giuseppe Perna)
topic Raman
FTIR
PCA-LDA
PLS-DA