rEMM: Extensible Markov Model for Data Stream Clustering in R

Bibliographic Details
Main Authors: Michael Hahsler, Margaret H. Dunham
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2010-10-01
Series: Journal of Statistical Software
Online Access: http://www.jstatsoft.org/v35/i05/paper
Description
Summary: Clustering streams of continuously arriving data has become an important application of data mining in recent years, and several researchers have proposed efficient algorithms. However, clustering alone neglects the fact that data in a data stream is characterized not only by the proximity of data points, which clustering uses, but also by a temporal component. The extensible Markov model (EMM) adds the temporal component to data stream clustering by superimposing a dynamically adapting Markov chain. In this paper, we introduce the R extension package rEMM, which implements EMM, and discuss some examples and applications.
ISSN: 1548-7660
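
The idea in the summary, a Markov chain superimposed on stream clusters, can be illustrated with a minimal sketch. This is not the rEMM package or its API: the class `SimpleEMM`, its methods, and the fixed-representative threshold clustering are all hypothetical simplifications chosen for illustration, and the sketch is written in Python rather than R. Each arriving point is assigned to the nearest existing cluster (state) if it lies within a distance threshold; otherwise a new state is created. The transition from the previous state to the current one is then counted.

```python
import math
from collections import defaultdict


class SimpleEMM:
    """Toy model (hypothetical, not the rEMM API): threshold clustering
    of a stream plus transition counts between the resulting clusters."""

    def __init__(self, threshold):
        self.threshold = threshold           # max distance to join a state
        self.centers = []                    # one fixed representative per state
        self.counts = []                     # points assigned to each state
        self.transitions = defaultdict(int)  # (from_state, to_state) -> count
        self.current = None                  # state of the previous point

    def _nearest(self, point):
        """Return (index, distance) of the closest state, or (None, inf)."""
        best, best_dist = None, math.inf
        for i, center in enumerate(self.centers):
            d = math.dist(point, center)
            if d < best_dist:
                best, best_dist = i, d
        return best, best_dist

    def update(self, point):
        """Assign one arriving point to a state and record the transition."""
        state, dist = self._nearest(point)
        if state is None or dist > self.threshold:
            # No state is close enough: extend the model with a new state.
            self.centers.append(tuple(point))
            self.counts.append(0)
            state = len(self.centers) - 1
        self.counts[state] += 1
        if self.current is not None:
            self.transitions[(self.current, state)] += 1
        self.current = state
        return state

    def transition_probability(self, i, j):
        """Estimate P(next = j | current = i) from the observed counts."""
        total = sum(n for (a, _), n in self.transitions.items() if a == i)
        return self.transitions.get((i, j), 0) / total if total else 0.0


# Usage: a short stream that alternates between two regions.
emm = SimpleEMM(threshold=1.0)
stream = [(0, 0), (0.1, 0), (5, 5), (0, 0.2), (5.1, 5)]
states = [emm.update(p) for p in stream]
print(states)
print(emm.transition_probability(0, 1))
```

The sketch keeps the first point of each cluster as its representative and never removes states or lets counts fade; these are simplifications assumed here for brevity.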