Analysis of Data Sets With Learning Conflicts for Machine Learning

In supervised learning, a machine learning system requires a data set. On occasion, however, the data set may contain learning conflicts that can drastically affect the performance of the learning system. This paper presents a method to analyze the learning conflicts in a data set. Several computer si...

Bibliographic Details
Main Authors: Sergio Ledesma, Mario-Alberto Ibarra-Manzano, Eduardo Cabal-Yepez, Dora-Luz Almanza-Ojeda, Juan-Gabriel Avina-Cervantes
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Data set; conflict level; conflict removal; machine learning; target value
Online Access: https://ieeexplore.ieee.org/document/8438452/
author Sergio Ledesma
Mario-Alberto Ibarra-Manzano
Eduardo Cabal-Yepez
Dora-Luz Almanza-Ojeda
Juan-Gabriel Avina-Cervantes
collection DOAJ
description In supervised learning, a machine learning system requires a data set. On occasion, however, the data set may contain learning conflicts that can drastically affect the performance of the learning system. This paper presents a method to analyze the learning conflicts in a data set. Several computer simulations are performed to test and validate the method. Two common functions in the field of optimization are used to create clean data sets. The data sets are then contaminated with random data, and the total learning conflict level for each case is computed. The proposed algorithm is used to identify the learning conflicts that were intentionally inserted. Next, an artificial neural network is trained and evaluated using the contaminated data set. The algorithm is also used in a real-world application to detect problems in a data set for a refrigeration system. It is concluded that the algorithm can be used to improve the performance of machine learning systems.
first_indexed 2024-12-19T13:50:32Z
format Article
id doaj.art-8f9d35fcfce9486996ddeffd0da35a67
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T13:50:32Z
publishDate 2018-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-8f9d35fcfce9486996ddeffd0da35a67
2022-12-21T20:18:45Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2018-01-01, vol. 6, pp. 45062-45070
doi:10.1109/ACCESS.2018.2865135
IEEE document 8438452
Analysis of Data Sets With Learning Conflicts for Machine Learning
Sergio Ledesma (https://orcid.org/0000-0001-8411-8740), Mario-Alberto Ibarra-Manzano, Eduardo Cabal-Yepez, Dora-Luz Almanza-Ojeda, Juan-Gabriel Avina-Cervantes
Department of Electrical and Computer Engineering, School of Engineering, University of Guanajuato, Salamanca, Mexico (all authors)
https://ieeexplore.ieee.org/document/8438452/
Data set; conflict level; conflict removal; machine learning; target value
title Analysis of Data Sets With Learning Conflicts for Machine Learning
topic Data set
conflict level
conflict removal
machine learning
target value
url https://ieeexplore.ieee.org/document/8438452/
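
The simulation outlined in the description (build a clean data set from an optimization test function, contaminate it with random data, then compute a conflict level) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the record does not name the two test functions, so the Rosenbrock function is used as a stand-in, and `conflict_scores` is a simple nearest-neighbour disagreement measure, not the algorithm proposed in the paper. The function names, sample count, contamination fraction, and `k` are likewise hypothetical.

```python
import numpy as np


def rosenbrock(x, y):
    """Rosenbrock test function (an assumed choice; the record names none)."""
    return (1.0 - x) ** 2 + 100.0 * (y - x ** 2) ** 2


def make_clean_data(n, rng):
    """Sample n inputs uniformly in [-2, 2]^2 and compute clean targets."""
    inputs = rng.uniform(-2.0, 2.0, size=(n, 2))
    targets = rosenbrock(inputs[:, 0], inputs[:, 1])
    return inputs, targets


def contaminate(targets, fraction, rng):
    """Replace a fraction of the targets with uniform random values."""
    noisy = targets.copy()
    n_bad = int(round(fraction * len(targets)))
    bad_idx = rng.choice(len(targets), size=n_bad, replace=False)
    noisy[bad_idx] = rng.uniform(targets.min(), targets.max(), size=n_bad)
    return noisy, np.sort(bad_idx)


def conflict_scores(inputs, targets, k=5):
    """Stand-in conflict measure (not the paper's): gap between each target
    and the median target of its k nearest neighbours in input space."""
    diff = inputs[:, None, :] - inputs[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.abs(targets - np.median(targets[nearest], axis=1))


rng = np.random.default_rng(0)
inputs, clean = make_clean_data(400, rng)
noisy, bad_idx = contaminate(clean, 0.05, rng)

scores = conflict_scores(inputs, noisy)
total_conflict = scores.sum()                 # one number per data set
flagged = np.argsort(scores)[-len(bad_idx):]  # highest-scoring samples
```

Comparing `flagged` with `bad_idx` gives a rough sense of how many of the inserted conflicts such a measure recovers; samples in steep regions of the function can also score highly, which is one reason a purpose-built method like the one the paper describes may be preferable.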