Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data

The paper is devoted to improving semi-supervised clustering methods and comparing their accuracy and robustness. The proposed approach extends a clustering algorithm to use an available set of labels by replacing its distance function. The new distance function takes into account not only...

Bibliographic Details
Main Authors: Leonid Lyubchyk, Klym Yamkovyi
Format: Article
Language: Ukrainian
Published: Igor Sikorsky Kyiv Polytechnic Institute 2022-12-01
Series: Sistemnì Doslìdženâ ta Informacìjnì Tehnologìï
Subjects: center of mass; clustering; distance function; medoids; nearest neighbor; semi-supervised learning
Online Access: http://journal.iasa.kpi.ua/article/view/239726
author Leonid Lyubchyk
Klym Yamkovyi
collection DOAJ
description The paper is devoted to improving semi-supervised clustering methods and comparing their accuracy and robustness. The proposed approach extends a clustering algorithm to use an available set of labels by replacing its distance function. The new distance function takes into account not only the spatial data but also the available labels. Moreover, the proposed distance function can be adapted to work with ordinal variables as labels. An extended approach is also considered, based on a combination of the unsupervised k-medoids method, modified to use only labeled data during the medoid calculation step, the supervised k-nearest-neighbors method, and unsupervised k-means. The learning algorithm uses information about the nearest points and the classes’ centers of mass. The experimental results demonstrate that even a small amount of labeled data makes semi-supervised learning applicable, and that the proposed modifications improve accuracy and algorithm performance.
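The record does not reproduce the paper's distance function, so the following is only an illustrative sketch of the general idea in the description: a distance that adds a label term to the spatial term, and medoids computed from labeled points only. The additive form, the `penalty` weight, the `UNLABELED` marker, and all function names are assumptions made for this example, not the authors' definitions.

```python
import numpy as np

UNLABELED = -1  # hypothetical marker for points without a label


def label_aware_distance(x_i, x_j, y_i, y_j, penalty=1.0):
    """Euclidean distance plus a label penalty when both points are labeled.

    Assumed form, not the paper's formula: the penalty scales with
    |y_i - y_j|, so ordinal labels that are further apart cost more.
    """
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    spatial = float(np.linalg.norm(diff))
    if y_i == UNLABELED or y_j == UNLABELED:
        return spatial
    return spatial + penalty * abs(int(y_i) - int(y_j))


def labeled_medoids(X, y):
    """For each class, pick the labeled point with the smallest total
    Euclidean distance to the other labeled points of that class."""
    medoids = {}
    for cls in sorted(set(y[y != UNLABELED].tolist())):
        pts = X[y == cls]
        pairwise = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        medoids[cls] = pts[pairwise.sum(axis=1).argmin()]
    return medoids


def assign_to_nearest_medoid(X, y, medoids, penalty=1.0):
    """Give every point the class of its nearest medoid under the
    label-aware distance (the medoid's class acts as its label)."""
    classes = list(medoids)
    out = np.empty(len(X), dtype=int)
    for i in range(len(X)):
        dists = [label_aware_distance(X[i], medoids[c], y[i], c, penalty)
                 for c in classes]
        out[i] = classes[int(np.argmin(dists))]
    return out


# Toy data: two well-separated groups, one labeled point in each.
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y = np.array([0, UNLABELED, UNLABELED, 1, UNLABELED, UNLABELED])
pred = assign_to_nearest_medoid(X, y, labeled_medoids(X, y))
print(pred)
```

In this sketch an unlabeled point is compared on spatial distance alone, while a labeled point is pushed away from medoids of other classes. The k-nearest-neighbors and k-means stages mentioned in the description are not covered here.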
first_indexed 2024-03-08T12:59:48Z
format Article
id doaj.art-e6606ba6dd674b62841f7dd803270bb5
institution Directory Open Access Journal
issn 1681-6048
2308-8893
language Ukrainian
last_indexed 2024-03-08T12:59:48Z
publishDate 2022-12-01
publisher Igor Sikorsky Kyiv Polytechnic Institute
record_format Article
series Sistemnì Doslìdženâ ta Informacìjnì Tehnologìï
spelling doaj.art-e6606ba6dd674b62841f7dd803270bb5
2024-01-19T12:36:00Z
ukr
Igor Sikorsky Kyiv Polytechnic Institute
Sistemnì Doslìdženâ ta Informacìjnì Tehnologìï
1681-6048
2308-8893
2022-12-01
43443
10.20535/SRIT.2308-8893.2022.4.03
277437
Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data
Leonid Lyubchyk, https://orcid.org/0000-0002-7428-8604, National Technical University “Kharkiv Polytechnic Institute”, Kharkiv
Klym Yamkovyi, https://orcid.org/0000-0001-9512-4150, National Technical University “Kharkiv Polytechnic Institute”, Kharkiv
http://journal.iasa.kpi.ua/article/view/239726
center of mass; clustering; distance function; medoids; nearest neighbor; semi-supervised learning
title Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data
topic center of mass
clustering
distance function
medoids
nearest neighbor
semi-supervised learning
url http://journal.iasa.kpi.ua/article/view/239726