HIGHLY ROBUST METHODS IN DATA MINING

Bibliographic Details
Main Author: Jan Kalina
Format: Article
Language: English
Published: University in Belgrade 2013-05-01
Series: Serbian Journal of Management
Online Access: http://www.sjm06.com/SJM%20ISSN1452-4864/8_1_2013_May_1_132/8_1_2013_9-24.pdf
Description
Summary: This paper is devoted to highly robust methods for information extraction from data, with special attention paid to methods suitable for management applications. The sensitivity of available data mining methods to the presence of outlying measurements in the observed data is discussed as their major drawback. The paper proposes several new highly robust methods for data mining, which are based on the idea of implicit weighting of individual data values. In particular, it proposes a novel robust method of hierarchical cluster analysis, a popular data mining method of unsupervised learning. Further, a robust method for estimating parameters in logistic regression is proposed, and this idea is extended to a robust multinomial logistic classification analysis. Finally, the sensitivity of neural networks to the presence of noise and outlying measurements in the data is discussed, and a method for robust training of neural networks for the task of function approximation is proposed; it has the form of a robust estimator in nonlinear regression.
ISSN: 1452-4864