A Probabilistic Bag-to-Class Approach to Multiple-Instance Learning


Bibliographic Details
Main Authors: Kajsa Møllersen, Jon Yngve Hardeberg, Fred Godtliebsen
Format: Article
Language: English
Published: MDPI AG 2020-06-01
Series: Data
Subjects:
Online Access: https://www.mdpi.com/2306-5729/5/2/56
author Kajsa Møllersen
Jon Yngve Hardeberg
Fred Godtliebsen
collection DOAJ
description Multi-instance (MI) learning is a branch of machine learning in which each object (bag) consists of multiple feature vectors (instances): for example, an image consisting of multiple patches and their corresponding feature vectors. In MI classification, each bag in the training set has a class label, but the instances are unlabeled. The instances are most commonly regarded as a set of points in a multi-dimensional space. Alternatively, the instances can be viewed as realizations of random vectors with a corresponding probability distribution, where the bag is the distribution, not the realizations. By introducing the probability distribution space to bag-level classification problems, dissimilarities between probability distributions (divergences) can be applied. The bag-to-bag Kullback–Leibler information is asymptotically the best classifier, but the typical sparseness of MI training sets is an obstacle. We introduce bag-to-class divergence to MI learning, emphasizing the hierarchical nature of the random vectors that makes bags from the same class different. We propose two properties for bag-to-class divergences, plus an additional property for sparse training sets, and present a dissimilarity measure that fulfils them. Its performance is demonstrated on synthetic and real data. The probability distribution space is a valid framework for MI learning, both for theoretical analysis and for applications.
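The bag-to-class idea in the abstract can be sketched numerically: treat each bag's instances as draws from a distribution, pool each class's training instances into a class-level distribution, and assign a bag to the class with the smallest divergence. The sketch below is an illustration only, not the authors' proposed dissimilarity measure; it assumes multivariate Gaussian estimates and uses the closed-form Kullback–Leibler divergence, and all function names are hypothetical.

```python
import numpy as np

def gaussian_fit(X):
    """Fit a multivariate Gaussian (mean, covariance) to instances X of shape (n, d)."""
    mu = X.mean(axis=0)
    # Small ridge on the diagonal keeps the covariance invertible for small bags.
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mu, cov

def kl_gaussian(mu0, cov0, mu1, cov1):
    """Closed-form KL(N0 || N1) between two multivariate Gaussians."""
    d = mu0.shape[0]
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0)
                  + diff @ inv1 @ diff
                  - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def classify_bag(bag, class_instances):
    """Bag-to-class rule: pick the class whose pooled-instance distribution
    has the smallest KL divergence from the bag's distribution.

    bag: (n, d) array of one bag's instances.
    class_instances: dict mapping class label -> pooled training instances.
    """
    mu_b, cov_b = gaussian_fit(bag)
    kls = {label: kl_gaussian(mu_b, cov_b, *gaussian_fit(X))
           for label, X in class_instances.items()}
    return min(kls, key=kls.get)
```

Note the contrast with a bag-to-bag scheme, which would compare the new bag against every training bag separately; pooling instances per class sidesteps the sparseness of individual bags at the cost of blurring within-class variation.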
format Article
id doaj.art-60bd907152734382bf21501f0ca5e065
institution Directory Open Access Journal
issn 2306-5729
language English
publishDate 2020-06-01
publisher MDPI AG
record_format Article
series Data
spelling doaj.art-60bd907152734382bf21501f0ca5e065
doi 10.3390/data5020056
affiliation Kajsa Møllersen: Department of Community Medicine, Faculty of Health Science, UiT The Arctic University of Norway, N-9037 Tromsø, Norway
affiliation Jon Yngve Hardeberg: Department of Computer Science, Faculty of Information Technology and Electrical Engineering, NTNU—Norwegian University of Science and Technology, N-2815 Gjøvik, Norway
affiliation Fred Godtliebsen: Department of Mathematics and Statistics, Faculty of Science and Technology, UiT The Arctic University of Norway, N-9037 Tromsø, Norway
title A Probabilistic Bag-to-Class Approach to Multiple-Instance Learning
title_sort probabilistic bag to class approach to multiple instance learning
topic image classification
multi-instance learning
divergence
dissimilarity
bag-to-class
Kullback–Leibler
url https://www.mdpi.com/2306-5729/5/2/56