Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data
The paper aims to improve semi-supervised clustering methods and to compare their accuracy and robustness. The proposed approach extends a clustering algorithm to use an available set of labels by replacing its distance function. The new distance function takes into account not only...
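The idea of a label-aware distance function mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the paper's actual formula, so everything below is an assumption made for illustration: the additive form (Euclidean distance plus a weighted label term), the `penalty` weight, the integer encoding of ordinal labels, and the helper names `label_aware_distance` and `nearest_medoid`.

```python
import math

def label_aware_distance(x, y, label_x=None, label_y=None, penalty=1.0):
    """Euclidean distance plus a label term.

    Hypothetical form, not the paper's formula. Labels are assumed to be
    integers (an ordinal encoding); None marks an unlabeled point.
    """
    spatial = math.dist(x, y)
    if label_x is None or label_y is None:
        # At least one point is unlabeled: fall back to the spatial term.
        return spatial
    # The absolute difference grows with the gap between ordinal labels
    # and is zero when the labels match.
    return spatial + penalty * abs(label_x - label_y)

def nearest_medoid(point, label, medoids, penalty=1.0):
    """Return the index of the closest medoid.

    `medoids` is a list of (coordinates, label) pairs, assumed here to be
    chosen from labeled data only.
    """
    return min(
        range(len(medoids)),
        key=lambda i: label_aware_distance(
            point, medoids[i][0], label, medoids[i][1], penalty
        ),
    )

medoids = [((0.0, 0.0), 0), ((1.0, 0.0), 1)]
print(nearest_medoid((0.4, 0.0), None, medoids))  # unlabeled point
print(nearest_medoid((0.4, 0.0), 1, medoids))     # same point, labeled 1
```

In this sketch an unlabeled point is assigned purely by position, while a labeled point can be pulled toward a medoid of its own class even when another medoid is spatially closer.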
Main Authors: | Leonid Lyubchyk, Klym Yamkovyi |
---|---|
Format: | Article |
Language: | Ukrainian |
Published: | Igor Sikorsky Kyiv Polytechnic Institute, 2022-12-01 |
Series: | Sistemnì Doslìdženâ ta Informacìjnì Tehnologìï |
Subjects: | center of mass; clustering; distance function; medoids; nearest neighbor; semi-supervised learning |
Online Access: | http://journal.iasa.kpi.ua/article/view/239726 |
_version_ | 1797351389853122560 |
---|---|
author | Leonid Lyubchyk; Klym Yamkovyi |
author_sort | Leonid Lyubchyk |
collection | DOAJ |
description | The paper aims to improve semi-supervised clustering methods and to compare their accuracy and robustness. The proposed approach extends a clustering algorithm to use an available set of labels by replacing its distance function. The new distance function takes into account not only spatial data but also the available labels. Moreover, it can be adapted to work with ordinal variables as labels. An extended approach is also considered, which combines the unsupervised k-medoids method, modified to use only labeled data during the medoid calculation step, the supervised k-nearest-neighbor method, and unsupervised k-means. The learning algorithm uses information about the nearest points and the classes’ centers of mass. Experiments show that even a small amount of labeled data makes semi-supervised learning feasible, and that the proposed modifications improve accuracy and algorithm performance. |
first_indexed | 2024-03-08T12:59:48Z |
format | Article |
id | doaj.art-e6606ba6dd674b62841f7dd803270bb5 |
institution | Directory Open Access Journal |
issn | 1681-6048 2308-8893 |
language | Ukrainian |
last_indexed | 2024-03-08T12:59:48Z |
publishDate | 2022-12-01 |
publisher | Igor Sikorsky Kyiv Polytechnic Institute |
record_format | Article |
series | Sistemnì Doslìdženâ ta Informacìjnì Tehnologìï |
spelling | doaj.art-e6606ba6dd674b62841f7dd803270bb5; 2024-01-19T12:36:00Z; ukr; Igor Sikorsky Kyiv Polytechnic Institute; Sistemnì Doslìdženâ ta Informacìjnì Tehnologìï; 1681-6048; 2308-8893; 2022-12-01; 4344310.20535/SRIT.2308-8893.2022.4.03277437; Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data; Leonid Lyubchyk 0 https://orcid.org/0000-0002-7428-8604; Klym Yamkovyi 1 https://orcid.org/0000-0001-9512-4150; National Technical University “Kharkiv Polytechnic Institute”, Kharkiv; National Technical University “Kharkiv Polytechnic Institute”, Kharkiv; The paper aims to improve semi-supervised clustering methods and to compare their accuracy and robustness. The proposed approach extends a clustering algorithm to use an available set of labels by replacing its distance function. The new distance function takes into account not only spatial data but also the available labels. Moreover, it can be adapted to work with ordinal variables as labels. An extended approach is also considered, which combines the unsupervised k-medoids method, modified to use only labeled data during the medoid calculation step, the supervised k-nearest-neighbor method, and unsupervised k-means. The learning algorithm uses information about the nearest points and the classes’ centers of mass. Experiments show that even a small amount of labeled data makes semi-supervised learning feasible, and that the proposed modifications improve accuracy and algorithm performance.; http://journal.iasa.kpi.ua/article/view/239726; center of mass; clustering; distance function; medoids; nearest neighbor; semi-supervised learning |
title | Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data |
topic | center of mass; clustering; distance function; medoids; nearest neighbor; semi-supervised learning |
url | http://journal.iasa.kpi.ua/article/view/239726 |