Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets

In many machine learning applications, measurements are sometimes incomplete or noisy resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore, more data samples are required to derive useful supervised or unsupervised classification...

Bibliographic Details
Main Authors: Cesar Federico Caiafa, Jordi Solé-Casals, Pere Marti-Puig, Sun Zhe, Toshihisa Tanaka
Format: Article
Language: English
Published: MDPI AG 2020-11-01
Series: Applied Sciences
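Subjects: empirical mode decomposition; machine learning; sparse representations; tensor decomposition; tensor completion
Online Access: https://www.mdpi.com/2076-3417/10/23/8481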
author Cesar Federico Caiafa
Jordi Solé-Casals
Pere Marti-Puig
Sun Zhe
Toshihisa Tanaka
collection DOAJ
description In many machine learning applications, measurements are sometimes incomplete or noisy resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore, more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing features imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples including: brain computer interface, epileptic intracranial electroencephalogram signals classification, face recognition/verification and water networks data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low quality datasets.
first_indexed 2024-03-10T14:30:40Z
format Article
id doaj.art-f1ff49b31bad4f56bd129dba2dbbe8fd
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T14:30:40Z
publishDate 2020-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-f1ff49b31bad4f56bd129dba2dbbe8fd
2023-11-20T22:37:45Z
eng
MDPI AG
Applied Sciences
2076-3417
2020-11-01
Volume 10, issue 23, article 8481
10.3390/app10238481
Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets
Cesar Federico Caiafa (Instituto Argentino de Radioastronomía—CCT La Plata, CONICET/CIC-PBA/UNLP, 1894 V. Elisa, Argentina)
Jordi Solé-Casals (Data and Signal Processing Research Group, University of Vic-Central University of Catalonia, 08500 Vic, Catalonia, Spain)
Pere Marti-Puig (Data and Signal Processing Research Group, University of Vic-Central University of Catalonia, 08500 Vic, Catalonia, Spain)
Sun Zhe (Computational Engineering Applications Unit, Head Office for Information Systems and Cybersecurity, RIKEN, Wako-Shi 351-0198, Japan)
Toshihisa Tanaka (Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo 184-8588, Japan)
https://www.mdpi.com/2076-3417/10/23/8481
empirical mode decomposition
machine learning
sparse representations
tensor decomposition
tensor completion
title Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets
topic empirical mode decomposition
machine learning
sparse representations
tensor decomposition
tensor completion
url https://www.mdpi.com/2076-3417/10/23/8481