A Varied Density-based Clustering Approach for Event Detection from Heterogeneous Twitter Data

Extracting the latent knowledge from Twitter by applying spatial clustering on geotagged tweets provides the ability to discover events and their locations. DBSCAN (density-based spatial clustering of applications with noise), which has been widely used to retrieve events from geotagged tweets, cann...


Bibliographic Details
Main Authors: Zeinab Ghaemi, Mahdi Farnaghi
Format: Article
Language: English
Published: MDPI AG 2019-02-01
Series: ISPRS International Journal of Geo-Information
Subjects: spatial clustering; density-based clustering; spatial heterogeneity; text similarity; Twitter
Online Access: https://www.mdpi.com/2220-9964/8/2/82
collection DOAJ
description Applying spatial clustering to geotagged tweets extracts latent knowledge from Twitter and makes it possible to discover events and their locations. DBSCAN (density-based spatial clustering of applications with noise), which has been widely used to retrieve events from geotagged tweets, cannot efficiently detect clusters when the dataset exhibits significant spatial heterogeneity, as is the case for Twitter data, where the distribution of users and the intensity of tweeting vary over the study area. This study proposes the VDCT (Varied Density-based spatial Clustering for Twitter data) algorithm, which extracts clusters from geotagged tweets while accounting for spatial heterogeneity. The algorithm employs exponential spline interpolation to determine different search radii for cluster detection. In addition to spatial proximity, it also takes textual similarity among tweets into account. To examine the efficiency of the algorithm, geotagged tweets collected during a hurricane in the United States were used for event detection. The output clusters of VDCT were compared to those of DBSCAN. Visual and quantitative comparison of the results demonstrated the feasibility of the proposed method.
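The idea summarized in the description (a DBSCAN-style search whose radius varies with local density, combined with a text-similarity check) can be illustrated with a rough sketch. This is not the authors' VDCT implementation: the record does not give the algorithm's details, so the per-point radius below uses a simple k-nearest-neighbour heuristic as a stand-in for the exponential spline interpolation, Jaccard token overlap is an assumed similarity measure, and all function names and parameters (`local_radii`, `varied_density_cluster`, `min_pts`, `min_sim`, `k`, `scale`) are hypothetical.

```python
import math


def jaccard(a, b):
    """Token-set Jaccard similarity between two strings (assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def local_radii(points, k=2, scale=1.5):
    """Per-point search radius: scale * distance to the k-th nearest neighbour.

    Stand-in heuristic only; the paper derives radii differently.
    """
    radii = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(scale * d[min(k, len(d)) - 1] if d else 0.0)
    return radii


def varied_density_cluster(points, texts, min_pts=3, min_sim=0.2, k=2, scale=1.5):
    """DBSCAN-like clustering with a per-point radius and a text-similarity filter.

    Returns one label per point; -1 marks noise.
    """
    radii = local_radii(points, k, scale)
    n = len(points)

    def neighbours(i):
        # A neighbour must be spatially close AND textually similar.
        return [j for j in range(n)
                if math.dist(points[i], points[j]) <= radii[i]
                and jaccard(texts[i], texts[j]) >= min_sim]

    labels = [None] * n  # None = unvisited, -1 = noise
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # only core points expand the cluster
                queue.extend(nb)
        cluster += 1
    return labels
```

With a single fixed radius, a value small enough to separate tweets in a dense area would typically miss a sparser group, which is the limitation of DBSCAN that the description points to; a per-point radius lets both groups form clusters.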
id doaj.art-03882912f4894b2a86c3ccd657226324
institution Directory Open Access Journal
issn 2220-9964
spelling doaj.art-03882912f4894b2a86c3ccd657226324 (2022-12-21T21:14:54Z); eng; MDPI AG; ISPRS International Journal of Geo-Information; ISSN 2220-9964; 2019-02-01; vol. 8, no. 2, art. 82; DOI 10.3390/ijgi8020082
Zeinab Ghaemi, Mahdi Farnaghi (both: Faculty of Geodesy and Geomatics Engineering, K. N. Toosi University of Technology, Tehran 1996715433, Iran)
https://www.mdpi.com/2220-9964/8/2/82
Keywords: spatial clustering; density-based clustering; spatial heterogeneity; text similarity; Twitter