Geographic Named Entity Recognition by Employing Natural Language Processing and an Improved BERT Model

Toponym recognition, or the challenge of detecting place names that have a similar referent, is involved in a number of activities connected to geographical information retrieval and geographical information sciences. This research focuses on recognizing Chinese toponyms from social media communicat...

Bibliographic Details
Main Authors: Liufeng Tao, Zhong Xie, Dexin Xu, Kai Ma, Qinjun Qiu, Shengyong Pan, Bo Huang
Format: Article
Language: English
Published: MDPI AG 2022-11-01
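Series: ISPRS International Journal of Geo-Information
Subjects: geographic named entity recognition; social media message; natural language processing; BERT; toponyms recognition
Online Access: https://www.mdpi.com/2220-9964/11/12/598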
author Liufeng Tao
Zhong Xie
Dexin Xu
Kai Ma
Qinjun Qiu
Shengyong Pan
Bo Huang
collection DOAJ
description Toponym recognition, or the challenge of detecting place names that have a similar referent, is involved in a number of activities connected to geographical information retrieval and geographical information sciences. This research focuses on recognizing Chinese toponyms from social media communications. While broad named entity recognition methods are frequently used to locate places, their accuracy is hampered by the many linguistic abnormalities seen in social media posts, such as informal sentence constructions, name abbreviations, and misspellings. In this study, we describe a Chinese toponym identification model based on a hybrid neural network that was created with these linguistic inconsistencies in mind. Our method adds a number of improvements to a standard bidirectional recurrent neural network model to help with location detection in social media messages. We demonstrate the results of a wide-ranging evaluation of the performance of different supervised machine learning methods, which have the natural advantage of avoiding human design features. A set of controlled experiments with four test datasets (one constructed and three public datasets) demonstrates the performance of supervised machine learning that can achieve good results on the task, significantly outperforming seven baseline models.
first_indexed 2024-03-09T16:20:55Z
format Article
id doaj.art-a2cc9de7da8b46a78e4186da90812189
institution Directory Open Access Journal
issn 2220-9964
language English
last_indexed 2024-03-09T16:20:55Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series ISPRS International Journal of Geo-Information
spelling doaj.art-a2cc9de7da8b46a78e4186da90812189
2023-11-24T15:20:58Z
eng
MDPI AG
ISPRS International Journal of Geo-Information
2220-9964
2022-11-01
Volume 11, Issue 12, Article 598
10.3390/ijgi11120598
Geographic Named Entity Recognition by Employing Natural Language Processing and an Improved BERT Model
Liufeng Tao (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
Zhong Xie (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
Dexin Xu (Wuhan Geomatics Institute, Wuhan 430074, China)
Kai Ma (Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443002, China)
Qinjun Qiu (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
Shengyong Pan (Wuhan Zondy Cyber Science & Technology Co., Ltd., Wuhan 430074, China)
Bo Huang (Wuhan Zondy Cyber Science & Technology Co., Ltd., Wuhan 430074, China)
https://www.mdpi.com/2220-9964/11/12/598
geographic named entity recognition
social media message
natural language processing
BERT
toponyms recognition
title Geographic Named Entity Recognition by Employing Natural Language Processing and an Improved BERT Model
topic geographic named entity recognition
social media message
natural language processing
BERT
toponyms recognition
url https://www.mdpi.com/2220-9964/11/12/598