A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets
Abstract Background A common problem in neurophysiological signal processing is the extraction of meaningful information from high dimension, low sample size data (HDLSS). We present RoLDSIS (regression on low-dimension spanned input space), a regression technique based on dimensionality reduction t...
Main Authors: | Adrielle C. Santana, Adriano V. Barbosa, Hani C. Yehia, Rafael Laboissière |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2021-01-01 |
Series: | BMC Neuroscience |
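Subjects: | Electroencephalography; Event-related potentials; Linear regression; High dimension low sample size problem; Dimension reduction; Phonemic categorization |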
Online Access: | https://doi.org/10.1186/s12868-020-00605-0 |
_version_ | 1818427046457507840 |
---|---|
author | Adrielle C. Santana; Adriano V. Barbosa; Hani C. Yehia; Rafael Laboissière |
author_sort | Adrielle C. Santana |
collection | DOAJ |
description | Abstract. Background: A common problem in neurophysiological signal processing is the extraction of meaningful information from high dimension, low sample size (HDLSS) data. We present RoLDSIS (regression on low-dimension spanned input space), a regression technique based on dimensionality reduction that constrains the solution to the subspace spanned by the available observations. This avoids the regularization parameters that shrinkage regression methods require. Results: We applied RoLDSIS to EEG data collected in a phonemic identification experiment. In the experiment, morphed syllables in the continuum /da/–/ta/ were presented as acoustic stimuli to the participants, and the event-related potentials (ERP) were recorded and then represented as a set of features in the time-frequency domain via the discrete wavelet transform. Each set of stimuli was chosen from a preliminary identification task executed by the participant. Physical and psychophysical attributes were associated with each stimulus. RoLDSIS was then used to infer the neurophysiological axes, in the feature space, associated with each attribute. We show that these axes can be reliably estimated and that their separation is correlated with the individual strength of phonemic categorization. The results provided by RoLDSIS are interpretable in the time-frequency domain and may be used to infer the neurophysiological correlates of phonemic categorization. A comparison with commonly used regularized regression techniques was carried out by cross-validation. Conclusion: The prediction errors obtained by RoLDSIS are comparable to those obtained with Ridge Regression and smaller than those obtained with LASSO and SPLS. However, RoLDSIS achieves this without the need for cross-validation, a procedure that requires the extraction of a large number of observations from the data and, consequently, decreases the signal-to-noise ratio when averaging trials. We show that, even though RoLDSIS is a simple technique, it is suitable for the processing and interpretation of neurophysiological signals. |
first_indexed | 2024-12-14T14:39:30Z |
format | Article |
id | doaj.art-878ae00e20e64b76a32f1f4189c2ca32 |
institution | Directory of Open Access Journals |
issn | 1471-2202 |
language | English |
last_indexed | 2024-12-14T14:39:30Z |
publishDate | 2021-01-01 |
publisher | BMC |
record_format | Article |
series | BMC Neuroscience |
spelling | doaj.art-878ae00e20e64b76a32f1f4189c2ca32; 2022-12-21T22:57:28Z; eng; BMC; BMC Neuroscience; 1471-2202; 2021-01-01; 221114; 10.1186/s12868-020-00605-0; A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets; Adrielle C. Santana (0), Adriano V. Barbosa (1), Hani C. Yehia (2), Rafael Laboissière (3); Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais; Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais; Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais; Univ. Grenoble Alpes, CNRS, LPNC UMR 5105; https://doi.org/10.1186/s12868-020-00605-0; Electroencephalography; Event-related potentials; Linear regression; High dimension low sample size problem; Dimension reduction; Phonemic categorization |
title | A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets |
topic | Electroencephalography; Event-related potentials; Linear regression; High dimension low sample size problem; Dimension reduction; Phonemic categorization |
url | https://doi.org/10.1186/s12868-020-00605-0 |
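
The abstract states that RoLDSIS constrains the regression solution to the subspace spanned by the available observations. The Python sketch below illustrates that general idea only; it is not the authors' implementation. It assumes a NumPy-based workflow in which the observations are centered, an orthonormal basis of their span is obtained by SVD, and ordinary least squares is solved in the reduced coordinates. The function name `span_constrained_regression`, its `tol` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np


def span_constrained_regression(X, y, tol=1e-10):
    """Fit y ~ X @ w + b with w restricted to the span of the centered rows of X.

    Illustrative sketch only (assumed procedure, not the published algorithm).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center features and response so the intercept can be recovered afterwards.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean

    # Rows of Vt form an orthonormal basis of the subspace spanned by the
    # centered observations; keep only directions with non-negligible variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    rank = int(np.sum(s > tol * s.max()))
    basis = Vt[:rank]                      # shape (rank, n_features)

    # Coordinates of each observation in the low-dimensional spanned space.
    Z = Xc @ basis.T                       # shape (n_samples, rank)

    # Ordinary least squares in the reduced space: no regularization parameter.
    coef_low, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)

    # Map the solution back to the original feature space.
    w = basis.T @ coef_low
    b = y_mean - x_mean @ w
    return w, b, basis


# Synthetic HDLSS example: 5 observations, 200 features (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w, b, basis = span_constrained_regression(X, y)
```

In this sketch the five centered observations span a four-dimensional subspace, so the reduced problem has as many unknowns as independent equations and no shrinkage parameter has to be tuned. The weight vector `w` lives in the original feature space, which is what would allow it to be inspected feature by feature.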