Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets
In many machine learning applications, measurements are sometimes incomplete or noisy resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore, more data samples are required to derive useful supervised or unsupervised classification...
Main Authors: | Cesar Federico Caiafa, Jordi Solé-Casals, Pere Marti-Puig, Sun Zhe, Toshihisa Tanaka |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-11-01 |
Series: | Applied Sciences |
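Subjects: | empirical mode decomposition, machine learning, sparse representations, tensor decomposition, tensor completion |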
Online Access: | https://www.mdpi.com/2076-3417/10/23/8481 |
_version_ | 1827701297051074560 |
---|---|
author | Cesar Federico Caiafa, Jordi Solé-Casals, Pere Marti-Puig, Sun Zhe, Toshihisa Tanaka |
author_facet | Cesar Federico Caiafa, Jordi Solé-Casals, Pere Marti-Puig, Sun Zhe, Toshihisa Tanaka |
author_sort | Cesar Federico Caiafa |
collection | DOAJ |
description | In many machine learning applications, measurements are sometimes incomplete or noisy resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore, more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing features imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples including: brain computer interface, epileptic intracranial electroencephalogram signals classification, face recognition/verification and water networks data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low quality datasets. |
first_indexed | 2024-03-10T14:30:40Z |
format | Article |
id | doaj.art-f1ff49b31bad4f56bd129dba2dbbe8fd |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T14:30:40Z |
publishDate | 2020-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-f1ff49b31bad4f56bd129dba2dbbe8fd; 2023-11-20T22:37:45Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2020-11-01; 10; 23; 8481; 10.3390/app10238481; Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets; Cesar Federico Caiafa [0], Jordi Solé-Casals [1], Pere Marti-Puig [2], Sun Zhe [3], Toshihisa Tanaka [4]; Instituto Argentino de Radioastronomía—CCT La Plata, CONICET/CIC-PBA/UNLP, 1894 V. Elisa, Argentina; Data and Signal Processing Research Group, University of Vic-Central University of Catalonia, 08500 Vic, Catalonia, Spain; Data and Signal Processing Research Group, University of Vic-Central University of Catalonia, 08500 Vic, Catalonia, Spain; Computational Engineering Applications Unit, Head Office for Information Systems and Cybersecurity, RIKEN, Wako-Shi 351-0198, Japan; Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo 184-8588, Japan; In many machine learning applications, measurements are sometimes incomplete or noisy resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore, more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing features imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples including: brain computer interface, epileptic intracranial electroencephalogram signals classification, face recognition/verification and water networks data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low quality datasets.; https://www.mdpi.com/2076-3417/10/23/8481; empirical mode decomposition, machine learning, sparse representations, tensor decomposition, tensor completion |
title | Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets |
topic | empirical mode decomposition, machine learning, sparse representations, tensor decomposition, tensor completion |
url | https://www.mdpi.com/2076-3417/10/23/8481 |