A Probabilistic Bag-to-Class Approach to Multiple-Instance Learning
Multi-instance (MI) learning is a branch of machine learning, where each object (bag) consists of multiple feature vectors (instances)—for example, an image consisting of multiple patches and their corresponding feature vectors. In MI classification, each bag in the training set has a class label, but the instances are unlabeled.
Main Authors: | Kajsa Møllersen, Jon Yngve Hardeberg, Fred Godtliebsen
---|---
Format: | Article
Language: | English
Published: | MDPI AG, 2020-06-01
Series: | Data
Subjects: | image classification; multi-instance learning; divergence; dissimilarity; bag-to-class; Kullback–Leibler
Online Access: | https://www.mdpi.com/2306-5729/5/2/56 |
author | Kajsa Møllersen, Jon Yngve Hardeberg, Fred Godtliebsen
collection | DOAJ |
description | Multi-instance (MI) learning is a branch of machine learning, where each object (bag) consists of multiple feature vectors (instances)—for example, an image consisting of multiple patches and their corresponding feature vectors. In MI classification, each bag in the training set has a class label, but the instances are unlabeled. The instances are most commonly regarded as a set of points in a multi-dimensional space. Alternatively, instances are viewed as realizations of random vectors with corresponding probability distribution, where the bag is the distribution, not the realizations. By introducing the probability distribution space to bag-level classification problems, dissimilarities between probability distributions (divergences) can be applied. The bag-to-bag Kullback–Leibler information is asymptotically the best classifier, but the typical sparseness of MI training sets is an obstacle. We introduce bag-to-class divergence to MI learning, emphasizing the hierarchical nature of the random vectors that makes bags from the same class different. We propose two properties for bag-to-class divergences, and an additional property for sparse training sets, and propose a dissimilarity measure that fulfils them. Its performance is demonstrated on synthetic and real data. The probability distribution space is valid for MI learning, both for the theoretical analysis and applications. |
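The bag-to-class idea described above can be sketched with a toy model: fit a distribution to each bag, pool the training instances of each class into a class-level distribution, and assign a new bag to the class with the smallest divergence. The sketch below is an illustrative simplification under assumptions of my own (univariate instances, Gaussian moment fits, closed-form Gaussian KL); it is not the dissimilarity measure the paper actually proposes.

```python
import math

def gaussian_kl(m1, v1, m2, v2):
    """KL divergence KL(N(m1,v1) || N(m2,v2)) between univariate Gaussians."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def fit_gaussian(instances):
    """Moment estimates (mean, variance) from a list of scalar instances."""
    n = len(instances)
    mean = sum(instances) / n
    var = sum((x - mean) ** 2 for x in instances) / n
    return mean, max(var, 1e-9)  # guard against zero variance

def classify_bag(bag, class_bags):
    """Assign a bag to the class whose pooled distribution is closest in KL."""
    m_b, v_b = fit_gaussian(bag)
    best_label, best_kl = None, float("inf")
    for label, bags in class_bags.items():
        pooled = [x for b in bags for x in b]  # pool all instances of the class
        m_c, v_c = fit_gaussian(pooled)
        kl = gaussian_kl(m_b, v_b, m_c, v_c)   # bag-to-class divergence
        if kl < best_kl:
            best_label, best_kl = label, kl
    return best_label

# Toy data: class 0 bags cluster near 0, class 1 bags near 5
training = {
    0: [[0.1, -0.2, 0.3], [-0.1, 0.0, 0.2]],
    1: [[4.8, 5.1, 5.3], [5.0, 4.9, 5.2]],
}
print(classify_bag([4.9, 5.2, 5.0], training))  # → 1
```

Pooling class instances is what makes this bag-to-class rather than bag-to-bag: with sparse training sets there may be too few bags for reliable bag-to-bag comparisons, but the pooled class distribution is estimated from all instances of the class.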
format | Article |
id | doaj.art-60bd907152734382bf21501f0ca5e065 |
institution | Directory Open Access Journal |
issn | 2306-5729 |
language | English |
publishDate | 2020-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Data |
doi | 10.3390/data5020056
affiliations | Kajsa Møllersen: Department of Community Medicine, Faculty of Health Science, UiT The Arctic University of Norway, N-9037 Tromsø, Norway; Jon Yngve Hardeberg: Department of Computer Science, Faculty of Information Technology and Electrical Engineering, NTNU—Norwegian University of Science and Technology, N-2815 Gjøvik, Norway; Fred Godtliebsen: Department of Mathematics and Statistics, Faculty of Science and Technology, UiT The Arctic University of Norway, N-9037 Tromsø, Norway
title | A Probabilistic Bag-to-Class Approach to Multiple-Instance Learning |
topic | image classification; multi-instance learning; divergence; dissimilarity; bag-to-class; Kullback–Leibler
url | https://www.mdpi.com/2306-5729/5/2/56 |