Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets

In many machine learning applications, measurements are incomplete or noisy, resulting in missing features. In other cases, datasets are small to begin with, and more data samples are required to derive useful supervised or unsupervised classification...

Bibliographic Details
Main Authors:	Cesar Federico Caiafa, Jordi Solé-Casals, Pere Marti-Puig, Sun Zhe, Toshihisa Tanaka
Format:	Article
Language:	English
Published:	MDPI AG 2020-11-01
Series:	Applied Sciences
Subjects:	empirical mode decomposition; machine learning; sparse representations; tensor decomposition; tensor completion
Online Access:	https://www.mdpi.com/2076-3417/10/23/8481

_version_	1827701297051074560
author	Cesar Federico Caiafa; Jordi Solé-Casals; Pere Marti-Puig; Sun Zhe; Toshihisa Tanaka
author_sort	Cesar Federico Caiafa
collection	DOAJ
description	In many machine learning applications, measurements are incomplete or noisy, resulting in missing features. In other cases, datasets are small to begin with, and more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets is a fundamental and classic challenge in machine learning. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in selected practical machine learning examples, including brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance on low-quality datasets.
first_indexed	2024-03-10T14:30:40Z
format	Article
id	doaj.art-f1ff49b31bad4f56bd129dba2dbbe8fd
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-10T14:30:40Z
publishDate	2020-11-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-f1ff49b31bad4f56bd129dba2dbbe8fd; 2023-11-20T22:37:45Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2020-11-01; volume 10, issue 23, article 8481; DOI 10.3390/app10238481; Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets; Cesar Federico Caiafa (Instituto Argentino de Radioastronomía - CCT La Plata, CONICET/CIC-PBA/UNLP, 1894 V. Elisa, Argentina); Jordi Solé-Casals (Data and Signal Processing Research Group, University of Vic-Central University of Catalonia, 08500 Vic, Catalonia, Spain); Pere Marti-Puig (Data and Signal Processing Research Group, University of Vic-Central University of Catalonia, 08500 Vic, Catalonia, Spain); Sun Zhe (Computational Engineering Applications Unit, Head Office for Information Systems and Cybersecurity, RIKEN, Wako-Shi 351-0198, Japan); Toshihisa Tanaka (Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo 184-8588, Japan); https://www.mdpi.com/2076-3417/10/23/8481; empirical mode decomposition; machine learning; sparse representations; tensor decomposition; tensor completion
title	Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets
topic	empirical mode decomposition; machine learning; sparse representations; tensor decomposition; tensor completion
url	https://www.mdpi.com/2076-3417/10/23/8481
