Using Web Crawler Technology for Text Analysis of Geo-Events: A Case Study of the Huangyan Island Incident
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications 2013-11-01 |
Series: | The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences |
Online Access: | http://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XL-4-W3/71/2013/isprsarchives-XL-4-W3-71-2013.pdf |
Summary: | As social networking and network socialisation bring more text information and social relationships into our daily lives,
the question of whether big data can be fully used to study the phenomena and disciplines of the natural sciences has prompted many
specialists and scholars to innovate in their research. Although politics has been integrally involved in the hyperlinked word issues since
the 1990s, and the automatic assembly of different geospatial webs and distributed geospatial information systems utilizing service chaining has
recently been explored and built, the information collection and data visualisation of geo-events have always faced the bottleneck of
traditional manual analysis because of the sensitivity, complexity, relativity, timeliness and unexpectedness of political
events. Based on the Heritrix framework and the analysis of web-based text, the word frequency, sentiment tendency and
dissemination path of the Huangyan Island incident are studied here by combining web crawler technology with text analysis
methods. The results indicate that the tag cloud, frequency map, attitudes pie, individual mention ratios and dissemination flow graph
derived from the data collection and processing highlight not only the subject and theme vocabularies of related topics but also certain
issues and problems behind them. Because it can express the time-space relationship of text information and disseminate
information regarding geo-events, text analysis of network information based on focused web crawler technology can be a tool
for understanding the formation and diffusion of web-based public opinions in political events. |
---|---|
ISSN: | 1682-1750 2194-9034 |