A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets

Abstract Background A common problem in neurophysiological signal processing is the extraction of meaningful information from high dimension, low sample size data (HDLSS). We present RoLDSIS (regression on low-dimension spanned input space), a regression technique based on dimensionality reduction t...


Bibliographic Details
Main Authors: Adrielle C. Santana, Adriano V. Barbosa, Hani C. Yehia, Rafael Laboissière
Format: Article
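Language: English
Published: BMC 2021-01-01
Series: BMC Neuroscience
Subjects: Electroencephalography; Event-related potentials; Linear regression; High dimension low sample size problem; Dimension reduction; Phonemic categorization
Online Access: https://doi.org/10.1186/s12868-020-00605-0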
collection DOAJ
description Abstract
Background: A common problem in neurophysiological signal processing is the extraction of meaningful information from high dimension, low sample size (HDLSS) data. We present RoLDSIS (regression on low-dimension spanned input space), a regression technique based on dimensionality reduction that constrains the solution to the subspace spanned by the available observations. This avoids the regularization parameters that shrinkage regression methods require.
Results: We applied RoLDSIS to EEG data collected in a phonemic identification experiment. In the experiment, morphed syllables in the continuum /da/–/ta/ were presented as acoustic stimuli to the participants. The event-related potentials (ERP) were recorded and then represented as a set of features in the time-frequency domain via the discrete wavelet transform. Each set of stimuli was chosen from a preliminary identification task executed by the participant. Physical and psychophysical attributes were associated with each stimulus. RoLDSIS was then used to infer the neurophysiological axes, in the feature space, associated with each attribute. We show that these axes can be reliably estimated and that their separation is correlated with the individual strength of phonemic categorization. The results provided by RoLDSIS are interpretable in the time-frequency domain and may be used to infer the neurophysiological correlates of phonemic categorization. A comparison with commonly used regularized regression techniques was carried out by cross-validation.
Conclusion: The prediction errors obtained by RoLDSIS are comparable to those obtained with Ridge Regression and smaller than those obtained with LASSO and SPLS. However, RoLDSIS achieves this without the need for cross-validation, a procedure that requires the extraction of a large number of observations from the data and consequently decreases the signal-to-noise ratio when averaging trials. We show that, even though RoLDSIS is a simple technique, it is suitable for the processing and interpretation of neurophysiological signals.
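The abstract describes a regression whose solution is constrained to the subspace spanned by the available observations. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes the observations are mean-centered, obtains an orthonormal basis of their span from a thin SVD, fits ordinary least squares in that low-dimensional space, and maps the coefficients back to the feature space. The function name `span_constrained_regression` and all variable names are hypothetical.

```python
import numpy as np


def span_constrained_regression(X, y):
    """Fit y ~ X @ w + b with w restricted to the span of the centered rows of X.

    X has shape (n_obs, n_features) with n_obs much smaller than n_features;
    y has shape (n_obs,). Illustrative sketch, not the published RoLDSIS code.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean

    # Thin SVD: the rows of Vt form an orthonormal basis of the subspace
    # spanned by the centered observations.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > 1e-10 * s.max()  # drop numerically null directions
    basis = Vt[keep]

    # Coordinates of each observation in the low-dimensional space.
    Z = Xc @ basis.T

    # Ordinary least squares in the reduced space; no penalty term is involved.
    coef_low, *_ = np.linalg.lstsq(Z, yc, rcond=None)

    # Map the coefficients back to the original feature space.
    w = basis.T @ coef_low
    b = y_mean - x_mean @ w
    return w, b
```

Under these assumptions the returned `w` coincides with the minimum-norm least-squares solution for the centered data, which is why no regularization parameter has to be tuned. With 5 observations of 200 features, for example, the reduced space has at most 4 dimensions.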
id doaj.art-878ae00e20e64b76a32f1f4189c2ca32
institution Directory of Open Access Journals
issn 1471-2202
spelling doaj.art-878ae00e20e64b76a32f1f4189c2ca32
2022-12-21T22:57:28Z
eng
BMC Neuroscience (BMC), ISSN 1471-2202, 2021-01-01, volume 22, issue 1, pages 1-14
https://doi.org/10.1186/s12868-020-00605-0
Adrielle C. Santana (Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais)
Adriano V. Barbosa (Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais)
Hani C. Yehia (Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais)
Rafael Laboissière (Univ. Grenoble Alpes, CNRS, LPNC UMR 5105)
title A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets
topic Electroencephalography
Event-related potentials
Linear regression
High dimension low sample size problem
Dimension reduction
Phonemic categorization
url https://doi.org/10.1186/s12868-020-00605-0