A dynamic logistic model for combining classifier outputs

Many classification algorithms are designed on the assumption that the population of interest is stationary, i.e. it does not change over time. However, there are many real-world problems where this assumption is not appropriate. In this thesis, we develop a classifier for non-stationary populations...


Bibliographic Details
Main Author: Tomas, A (Amber Tomas)
Other Authors: Ripley, B
Format: Thesis
Language: English
Published: 2008
Subjects: Pattern recognition (statistics)
Description: Many classification algorithms are designed on the assumption that the population of interest is stationary, i.e. it does not change over time. However, there are many real-world problems where this assumption is not appropriate. In this thesis, we develop a classifier for non-stationary populations which is based on a multiple logistic model for the conditional class probabilities and incorporates a linear combination of the outputs of a number of pre-determined component classifiers. The final classifier is able to adjust to changes in the population by sequential updating of the coefficients of the linear combination, which are the parameters of the model. The model we use is motivated by the relatively good classification performance which has been achieved by classification rules based on combining classifier outputs. However, in some cases such classifiers can also perform relatively poorly, and in general the mechanisms behind such results are little understood. For the model we propose, which is a generalisation of several existing models for stationary classification problems, we show there exists a simple relationship between the component classifiers which are used, the sign of the parameters and the decision boundaries of the final classifier. This relationship can be used to guide the choice of component classifiers, and helps with understanding the conditions necessary for the classifier to perform well. We compare several "on-line" algorithms for implementing the classification model, where the classifier is updated as new labelled observations become available. The predictive approach to classification is adopted, so each algorithm is based on updating the posterior distribution of the parameters as new information is received. Specifically, we compare a method which assumes the posterior distribution is Gaussian, a more general method developed for the class of Dynamic Generalised Linear Models, and a method based on a sequential Monte Carlo approximation of the posterior. The relationship between the model used for parameter evolution, the bias of the parameter estimates and the error of the classifier is explored.
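The description outlines a logistic model whose parameters are the coefficients of a linear combination of component classifier outputs, updated sequentially as labelled observations arrive. The following is a minimal Python sketch of that general idea for the two-class case. It is an illustration under stated assumptions, not the algorithm from the thesis: the random-walk parameter evolution, the single Newton-style (extended-Kalman-type) Gaussian update, and the names `predict_proba`, `update` and `drift_var` are all hypothetical choices made for this example.

```python
import numpy as np


def sigmoid(a):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + np.exp(-a))


def predict_proba(mean, outputs):
    """P(class 1 | x) from a linear combination of component classifier outputs.

    `outputs` holds the component classifier outputs for one observation;
    `mean` is the current posterior mean of the combination coefficients.
    """
    return sigmoid(np.dot(mean, outputs))


def update(mean, cov, outputs, label, drift_var=0.01):
    """One sequential update after observing a labelled point (label in {0, 1}).

    ASSUMPTIONS (not from the source): coefficients follow a random walk with
    variance `drift_var`, and the posterior is kept Gaussian by a single
    Newton-style step (an extended-Kalman-type approximation).
    """
    outputs = np.asarray(outputs, dtype=float)
    # Evolution step: inflate the covariance so the coefficients can drift.
    cov_pred = cov + drift_var * np.eye(len(mean))
    # Predicted probability and its local curvature p(1 - p).
    p = predict_proba(mean, outputs)
    s = p * (1.0 - p)
    # Rank-one (Sherman-Morrison) covariance update.
    cz = cov_pred @ outputs
    cov_new = cov_pred - s * np.outer(cz, cz) / (1.0 + s * outputs @ cz)
    # Move the mean in the direction of the prediction error.
    mean_new = mean + cov_new @ outputs * (label - p)
    return mean_new, cov_new


if __name__ == "__main__":
    mean, cov = np.zeros(2), np.eye(2)
    # Two hypothetical component classifiers voting in {-1, +1}.
    stream = [([1.0, -1.0], 1), ([1.0, 1.0], 1), ([-1.0, 1.0], 0)]
    for outputs, label in stream:
        mean, cov = update(mean, cov, outputs, label)
    print(mean)
```

In this sketch a larger `drift_var` lets the coefficients track population change more quickly at the cost of noisier estimates, which is one simple way to see the trade-off between the parameter evolution model and classifier error that the description mentions.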
id oxford-uuid:0a0273fa-9d47-4626-a758-6b5f03722cd0
institution University of Oxford
topic Pattern recognition (statistics)