Three Similarity Measures between One-Dimensional DataSets

Based on an interval distance, three functions are given in order to quantify similarities between one-dimensional data sets by using first-order statistics. The Glass Identification Database is used to illustrate how to analyse a data set prior to its classification and/or to exclude dimensions.


Bibliographic Details
Main Authors: LUIS GONZALEZ-ABRIL, JOSE M. GAVILAN, FRANCISCO VELASCO MORENTE
Format: Article
Language: English
Published: Universidad Nacional de Colombia 2014-06-01
Series: Revista Colombiana de Estadística
Subjects:
Online Access: http://www.scielo.org.co/scielo.php?script=sci_arttext&pid=S0120-17512014000100006&lng=en&tlng=en
author LUIS GONZALEZ-ABRIL
JOSE M. GAVILAN
FRANCISCO VELASCO MORENTE
collection DOAJ
description Based on an interval distance, three functions are given in order to quantify similarities between one-dimensional data sets by using first-order statistics. The Glass Identification Database is used to illustrate how to analyse a data set prior to its classification and/or to exclude dimensions. Furthermore, a non-parametric hypothesis test is designed to show how these similarity measures, based on random samples from two populations, can be used to decide whether these populations are identical. Two comparative analyses are also carried out with a parametric test and a non-parametric test. This new non-parametric test performs reasonably well in comparison with classic tests.
first_indexed 2024-12-10T19:30:59Z
format Article
id doaj.art-caf92c80fd9043a595ec5b8aca67318e
institution Directory Open Access Journal
issn 0120-1751
language English
last_indexed 2024-12-10T19:30:59Z
publishDate 2014-06-01
publisher Universidad Nacional de Colombia
record_format Article
series Revista Colombiana de Estadística
spelling doaj.art-caf92c80fd9043a595ec5b8aca67318e
2022-12-22T01:36:15Z
eng
Universidad Nacional de Colombia
Revista Colombiana de Estadística, ISSN 0120-1751
2014-06-01, vol. 37, no. 1, pp. 79-94
doi 10.15446/rce.v37n1.44359
S0120-17512014000100006
Three Similarity Measures between One-Dimensional DataSets
LUIS GONZALEZ-ABRIL (Universidad de Sevilla)
JOSE M. GAVILAN (Universidad de Sevilla)
FRANCISCO VELASCO MORENTE (Universidad de Sevilla)
title Three Similarity Measures between One-Dimensional DataSets
topic distance between intervals
kernel methods
data mining
non-parametric tests
url http://www.scielo.org.co/scielo.php?script=sci_arttext&pid=S0120-17512014000100006&lng=en&tlng=en