Collecting Typhoon Disaster Information from Twitter Based on Query Expansion

Bibliographic Details
Main Authors: Zi Chen, Samsung Lim
Format: Article
Language: English
Published: MDPI AG 2018-04-01
Series: ISPRS International Journal of Geo-Information
Subjects: information retrieval; social media; typhoon; query expansion
Online Access: http://www.mdpi.com/2220-9964/7/4/139
_version_ 1818234425375195136
author Zi Chen
Samsung Lim
author_facet Zi Chen
Samsung Lim
author_sort Zi Chen
collection DOAJ
description Social media is a popular source of volunteered geographic information owing to its massive real-time data; however, the use of social media data in the context of geospatial analysis is challenging because complex semantic filters are required for the aggregation of geographic messages from the data streams. This article proposes a new query expansion method for social media streams which updates the query keywords periodically by the words extracted from the preceding search results. The proposed method has optimized the trade-off between precision and coverage of geographical messages by factoring in the influences of the keyword number and refresh cycle in the query process, and some improvements on the classic Term Frequency-Inverse Document Frequency (TF-IDF) method for short texts were achieved. Furthermore, a number of filters based upon relevance to the target topic were established and tested. This method was tested on a dataset from Twitter within the geographic extent of Macau in August 2017 during two consecutive typhoon hits. The result supports its effectiveness with a controllable precision and considerable increment of relevant information. Moreover, the query keywords can adjust themselves to the local language environment by discovering new keywords. To conclude, this query expansion method is able to provide a reliable method for social media-based information retrieval.
first_indexed 2024-12-12T11:37:52Z
format Article
id doaj.art-7442fe87abec450baefe28c045d7ad7c
institution Directory Open Access Journal
issn 2220-9964
language English
last_indexed 2024-12-12T11:37:52Z
publishDate 2018-04-01
publisher MDPI AG
record_format Article
series ISPRS International Journal of Geo-Information
spelling doaj.art-7442fe87abec450baefe28c045d7ad7c
2022-12-22T00:25:36Z
eng
MDPI AG
ISPRS International Journal of Geo-Information
2220-9964
2018-04-01
Volume 7, Issue 4, Article 139
DOI 10.3390/ijgi7040139
ijgi7040139
Collecting Typhoon Disaster Information from Twitter Based on Query Expansion
Zi Chen (0)
Samsung Lim (1)
(0) School of Civil and Environmental Engineering, University of New South Wales, Sydney, NSW 2052, Australia
(1) School of Civil and Environmental Engineering, University of New South Wales, Sydney, NSW 2052, Australia
http://www.mdpi.com/2220-9964/7/4/139
information retrieval
social media
typhoon
query expansion
title Collecting Typhoon Disaster Information from Twitter Based on Query Expansion
topic information retrieval
social media
typhoon
query expansion
url http://www.mdpi.com/2220-9964/7/4/139