A decision theoretic approach for segmental classification using Hidden Markov models.


Bibliographic Details
Main Authors: Yau, C, Holmes, C
Format: Working paper
Language: English
Published: Oxford-Man Institute of Quantitative Finance 2009
Description
Summary: This paper is concerned with statistical methods for the analysis of linear sequence data using Hidden Markov Models (HMMs), where the task is to segment and classify the data according to the underlying hidden state sequence. Such analysis is commonplace in the empirical sciences, including genomics, finance and speech processing. In particular, we are interested in answering the question: given data y and a statistical model π(x, y) of the hidden states x, what shall we report as the prediction x̂ under π(x|y)? That is, how should one make a prediction of the underlying states? We demonstrate that traditional approaches, such as reporting the most probable state sequence or the most probable set of marginal predictions, lead in almost all cases to sub-optimal performance. We propose a decision theoretic approach using a novel class of Markov loss functions and report x̂ via the principle of minimum expected loss. We demonstrate that the sequence of minimum expected loss under the Markov loss function can be computed using dynamic programming methods and that it offers substantial improvements and flexibility over existing techniques. The result is generic and applicable to any probabilistic model on a sequence, such as change point or product partition models.
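The abstract does not spell out the form of the Markov loss functions, so the following is only an illustrative sketch of the general idea: compute pairwise posterior probabilities p(x_t, x_{t+1} | y) with the forward-backward algorithm, then use dynamic programming to find the state sequence x̂ minimising the expected value of a loss that decomposes over consecutive pairs. The HMM parameters, the loss tensor, and all function names here are hypothetical choices for the demonstration, not the paper's own specification.

```python
import numpy as np

def pairwise_posteriors(pi0, A, B, obs):
    """Forward-backward for a discrete HMM.
    pi0: (K,) initial distribution, A: (K,K) transition matrix,
    B: (K,M) emission matrix, obs: length-T array of symbol indices.
    Returns xi with xi[t, a, b] = p(x_t = a, x_{t+1} = b | y)."""
    T, K = len(obs), len(pi0)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    alpha[0] = pi0 * B[:, obs[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                     # normalised forward pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):            # normalised backward pass
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()
    xi = np.zeros((T - 1, K, K))
    for t in range(T - 1):
        m = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        xi[t] = m / m.sum()
    return xi

def markov_loss_decode(xi, loss):
    """Minimise sum_t E[ loss((x_t, x_{t+1}), (xhat_t, xhat_{t+1})) ]
    over candidate sequences xhat by dynamic programming.
    loss[a, b, ahat, bhat] is the pairwise loss tensor, shape (K,K,K,K)."""
    Tm1, K = xi.shape[0], xi.shape[1]
    # Expected pairwise loss of choosing (ahat, bhat) at each step.
    el = np.einsum('tab,abcd->tcd', xi, loss)
    cost = np.full((Tm1 + 1, K), np.inf)
    back = np.zeros((Tm1 + 1, K), dtype=int)
    cost[0] = 0.0
    for t in range(Tm1):                      # forward DP recursion
        for b in range(K):
            c = cost[t] + el[t, :, b]
            back[t + 1, b] = int(np.argmin(c))
            cost[t + 1, b] = c[back[t + 1, b]]
    xhat = np.zeros(Tm1 + 1, dtype=int)       # backtrace the optimum
    xhat[-1] = int(np.argmin(cost[-1]))
    for t in range(Tm1, 0, -1):
        xhat[t - 1] = back[t, xhat[t]]
    return xhat
```

As a sanity check on the framework: plugging in a pure symbol-wise (Hamming-style) loss, loss[a, b, ahat, bhat] = 1{b ≠ bhat}, makes the decoder reproduce marginal (posterior) decoding at positions 1..T-1, while a loss with off-diagonal pairwise terms can additionally penalise spurious segment boundaries — which is the kind of flexibility the summary attributes to Markov loss functions.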