AN EFFICIENT CLUSTERING METHOD FOR DBSCAN GEOGRAPHIC SPATIO-TEMPORAL LARGE DATA WITH IMPROVED PARAMETER OPTIMIZATION

Bibliographic Details
Main Authors: J. W. Li, X. Q. Han, J. W. Jiang, Y. Hu, L. Liu
Format: Article
Language: English
Published: Copernicus Publications 2020-02-01
Series: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLII-3-W10/581/2020/isprs-archives-XLII-3-W10-581-2020.pdf
collection DOAJ
description Establishing an effective method for analyzing large geographic spatio-temporal data, one that quickly and accurately uncovers the value hidden in geographic information, has become a current research focus. Researchers have found that cluster analysis methods from data mining can effectively extract the knowledge and information hidden in complex, massive spatio-temporal data, and density-based clustering is one of the most important of these methods. However, the traditional DBSCAN clustering algorithm has drawbacks in parameter selection that are difficult to overcome. For example, its two key parameters, the neighborhood radius Eps and the density threshold MinPts, must be set manually. Suitable values cannot be selected by following the parameter-setting guidelines of traditional DBSCAN, so the algorithm cannot produce accurate clustering results. To address the misclassification and density-sparsity problems caused by unreasonable parameter selection in DBSCAN, this paper proposes an efficient DBSCAN-based density clustering method with improved parameter optimization. The method runs k-clustering for successive values of k, computes an evaluation index function (Optimal Distance) for each run, and selects the optimal solution. The samples are then clustered with the optimal k. Appropriate values of Eps and MinPts are determined through mathematical and physical analysis, and the final clustering results are obtained by running DBSCAN with them. Experiments show that the method selects reasonable parameters for DBSCAN clustering, which demonstrates the superiority of the proposed method.
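To make concrete why Eps and MinPts matter, the following is a minimal Python sketch of plain DBSCAN together with one possible automatic choice of Eps. It is not the paper's method: the abstract does not define the Optimal Distance index, so the sketch swaps in the widely used k-distance heuristic instead. The function names (`dbscan`, `k_distances`, `suggest_eps`), the "largest jump" rule for picking Eps, and the convention MinPts = k + 1 are all assumptions made for illustration.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: one label per point, NOISE (-1) for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep growing
                queue.extend(j_neighbors)
    return labels


def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest other point."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return sorted(out)


def suggest_eps(points, k):
    """Assumed heuristic: the k-distance just below the largest jump."""
    kd = k_distances(points, k)
    gaps = [kd[i + 1] - kd[i] for i in range(len(kd) - 1)]
    return kd[gaps.index(max(gaps))]


# Two tight squares of points plus one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
k = 2
eps = suggest_eps(points, k)          # 1.0 for this toy data
labels = dbscan(points, eps, k + 1)   # MinPts = k + 1 is an assumed convention
print(eps, labels)  # 1.0 [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

On this toy data the heuristic separates the two squares and flags the isolated point as noise. With a poorly chosen value such as Eps = 0.5, the same call labels every point as noise, which is the kind of sensitivity to manual parameter setting that the abstract describes.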
first_indexed 2024-12-10T22:12:38Z
id doaj.art-56dfefb2ebe44c01b416aee4b52c79ff
institution Directory Open Access Journal
issn 1682-1750
2194-9034
last_indexed 2024-12-10T22:12:38Z
spelling doaj.art-56dfefb2ebe44c01b416aee4b52c79ff 2022-12-22T01:31:34Z eng
volume XLII-3-W10
pages 581-584
doi 10.5194/isprs-archives-XLII-3-W10-581-2020
affiliation Guangxi Key Laboratory of Spatial Information and Geomatics, Guilin University of Technology, Guilin 541004, China
Guilin University of Technology, Guilin 541004, China