Discriminative Adaptive Sets for Multi-Label Classification
Multi-label classification aims to associate multiple labels with a given data/object instance to describe it better. Multi-label data sets are common in many emerging application areas, such as text/multimedia classification, bioinformatics, medical image annotation, and computer vision, to name a...
Main Authors: | Muhammad Usman Ghani, Muhammad Rafi, Muhammad Atif Tahir |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
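Subjects: | Lazy learning; multi-label classification; instance based learning; MAP estimation; nearest neighbors |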
Online Access: | https://ieeexplore.ieee.org/document/9274293/ |
_version_ | 1828735550190780416 |
---|---|
author | Muhammad Usman Ghani; Muhammad Rafi; Muhammad Atif Tahir |
author_sort | Muhammad Usman Ghani |
collection | DOAJ |
description | Multi-label classification aims to associate multiple labels with a given data/object instance to describe it better. Multi-label data sets are common in many emerging application areas, such as text/multimedia classification, bioinformatics, medical image annotation, and computer vision, to name a few. There is growing interest in efficient and accurate multi-label classification. There are two major approaches to multi-label classification: (i) problem transformation methods and (ii) algorithm adaptation methods. In algorithm adaptation, traditional classification algorithms are modified to handle multi-label data sets. One classification algorithm that is often adapted for multi-label classification is k-nearest neighbor (kNN). kNN is popular because of its simplicity, ease of implementation, and adaptability. Despite its merits, it has several drawbacks: sensitivity to noisy data, missing values, and outliers; dependence on feature scaling; and reduced accuracy when the solution space is large and overlapping. In this paper, a modification to the kNN method is suggested for multi-label classification, with three improvement strategies: (i) selection of local examples with respect to the unknown example, motivated by the fact that a local and relevant space is vital for improving multi-label classification; (ii) splitting the input space into multiple sub-spaces for optimal label estimation, motivated by the need to estimate labels accurately in the presence of noisy labels; and (iii) selection of labels using Mean Average Precision (MAP) estimates, motivated by the aim of using the training data effectively when estimating the hidden distribution and the optimal parameters for the method. The proposed method is implemented and compared with state-of-the-art approaches that are based on kNN, or on similar ideas, and that select and optimize relevant spaces for multi-label classification. The evaluation uses multiple metrics, including Hamming loss, precision/recall, and F-measure. The suggested approach performed much better than the state-of-the-art methods on data sets with strong label cardinality. |
first_indexed | 2024-04-12T23:09:18Z |
format | Article |
id | doaj.art-4cb928db076347dcb266f444251f3e33 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-12T23:09:18Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-4cb928db076347dcb266f444251f3e33; 2022-12-22T03:12:50Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 227579-227595; DOI 10.1109/ACCESS.2020.3041763; IEEE document 9274293; Discriminative Adaptive Sets for Multi-Label Classification; Muhammad Usman Ghani (https://orcid.org/0000-0002-0192-8850); Muhammad Rafi (https://orcid.org/0000-0002-3673-5979); Muhammad Atif Tahir (https://orcid.org/0000-0003-1366-8408); all authors: School of Computer Science, National University of Computer and Emerging Sciences, Karachi Campus, Karachi, Pakistan; https://ieeexplore.ieee.org/document/9274293/; keywords: Lazy learning; multi-label classification; instance based learning; MAP estimation; nearest neighbors |
title | Discriminative Adaptive Sets for Multi-Label Classification |
topic | Lazy learning; multi-label classification; instance based learning; MAP estimation; nearest neighbors |
url | https://ieeexplore.ieee.org/document/9274293/ |
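
The description in this record names kNN as the base classifier and Hamming loss, precision/recall and F-measure as evaluation metrics. As a rough illustration only, the sketch below shows a plain majority-vote multi-label kNN and those metrics in Python. It is not the method proposed in the article: the local-example selection, sub-space splitting and MAP-based label selection are not reproduced, and every function name, the vote threshold and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def knn_multilabel_predict(train_X, train_Y, query, k=3):
    """Baseline sketch (hypothetical helper): vote over the k nearest points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    neighbours = order[:k]
    # Count how many of the k neighbours carry each label.
    votes = Counter(label for i in neighbours for label in train_Y[i])
    # Assumed rule: keep a label when more than half of the neighbours have it.
    return {label for label, count in votes.items() if count > k / 2}


def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of instance-label pairs that are predicted incorrectly."""
    # The symmetric difference holds the labels that are wrongly added or missed.
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


def precision_recall_f1(true_set, pred_set):
    """Example-based precision, recall and F-measure for one instance."""
    tp = len(true_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(true_set) if true_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data, invented for illustration: 2-D points with label sets.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
train_Y = [{"a", "b"}, {"a"}, {"a", "b"}, {"c"}, {"c", "b"}]
predicted = knn_multilabel_predict(train_X, train_Y, (0.2, 0.2), k=3)
```

With the toy data, the three nearest neighbours of the query are the first three training points, so labels "a" (3 votes) and "b" (2 votes) pass the majority threshold. Hamming loss counts every wrongly added or missed label, which is why it is usually reported alongside precision, recall and F-measure rather than in place of them.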