SNEToolkit: Spatial named entities disambiguation toolkit

“Can you tell me where San Jose is located?” “Uh! Do you know that there are more than 1700 locations named San Jose in the world?” The official name of a location is often not the name with which we are familiar. Spatial named entity (SNE) disambiguation is the process of identifying and assigning...

Bibliographic Details
Main Authors: Rodrique Kafando, Rémy Decoupes, Mathieu Roche, Maguelonne Teisseire
Format: Article
Language: English
Published: Elsevier 2023-07-01
Series: SoftwareX
Subjects: Spatial named entity; Disambiguation; Geocoding; Software
Online Access: http://www.sciencedirect.com/science/article/pii/S2352711023001760
author Rodrique Kafando
Rémy Decoupes
Mathieu Roche
Maguelonne Teisseire
collection DOAJ
description “Can you tell me where San Jose is located?” “Uh! Do you know that there are more than 1700 locations named San Jose in the world?” The official name of a location is often not the name with which we are familiar. Spatial named entity (SNE) disambiguation is the process of identifying and assigning precise coordinates to a place name that can be identified in a text. This task is not always straightforward, especially when the place name in question is ambiguous for various reasons. In this context, we are interested in the disambiguation of spatial named entities that can be identified in a textual document on a country level. The solution that we propose is based on a set of techniques that allow us to disambiguate the spatial entity considering the context in which it is mentioned from a certain number of characteristics that are specific to it. The solution uses as input a textual document and extricates the named entities identified therein while associating them with the correct coordinates. SNE disambiguation is designed to support the process of fast exploration of spatiotemporal data analysis, most often for event tracking. The proposed approach was tested on 1360 SNEs extracted from the GeoVirus dataset. The results show that SNEToolkit outperformed the baseline, the standard Geonames geocoder, with a recall value of 0.911 against a recall value of 0.871 for the baseline. A flexible Python package is provided for end users.
first_indexed 2024-03-11T23:14:57Z
format Article
id doaj.art-d5969d2abfb748038ef8a0da8e201c37
institution Directory Open Access Journal
issn 2352-7110
language English
last_indexed 2024-03-11T23:14:57Z
publishDate 2023-07-01
publisher Elsevier
record_format Article
series SoftwareX
spelling doaj.art-d5969d2abfb748038ef8a0da8e201c37
2023-09-21T04:37:38Z
eng
Elsevier
SoftwareX
2352-7110
2023-07-01
23
101480
SNEToolkit: Spatial named entities disambiguation toolkit
Rodrique Kafando (0)
Rémy Decoupes (1)
Mathieu Roche (2)
Maguelonne Teisseire (3)
(0) TETIS, Univ Montpellier, AgroParisTech, CIRAD, CNRS, INRAE, 500 rue Jean François Breton, Montpellier, 34090, France; CITADEL, Univ. Virtuelle BF, Ouagadougou, Burkina Faso
(1) TETIS, Univ Montpellier, AgroParisTech, CIRAD, CNRS, INRAE, 500 rue Jean François Breton, Montpellier, 34090, France
(2) TETIS, Univ Montpellier, AgroParisTech, CIRAD, CNRS, INRAE, 500 rue Jean François Breton, Montpellier, 34090, France; CIRAD, F-34398 Montpellier, France
(3) TETIS, Univ Montpellier, AgroParisTech, CIRAD, CNRS, INRAE, 500 rue Jean François Breton, Montpellier, 34090, France; Corresponding author.
“Can you tell me where San Jose is located?” “Uh! Do you know that there are more than 1700 locations named San Jose in the world?” The official name of a location is often not the name with which we are familiar. Spatial named entity (SNE) disambiguation is the process of identifying and assigning precise coordinates to a place name that can be identified in a text. This task is not always straightforward, especially when the place name in question is ambiguous for various reasons. In this context, we are interested in the disambiguation of spatial named entities that can be identified in a textual document on a country level. The solution that we propose is based on a set of techniques that allow us to disambiguate the spatial entity considering the context in which it is mentioned from a certain number of characteristics that are specific to it. The solution uses as input a textual document and extricates the named entities identified therein while associating them with the correct coordinates. SNE disambiguation is designed to support the process of fast exploration of spatiotemporal data analysis, most often for event tracking. The proposed approach was tested on 1360 SNEs extracted from the GeoVirus dataset. The results show that SNEToolkit outperformed the baseline, the standard Geonames geocoder, with a recall value of 0.911 against a recall value of 0.871 for the baseline. A flexible Python package is provided for end users.
http://www.sciencedirect.com/science/article/pii/S2352711023001760
Spatial named entity
Disambiguation
Geocoding
Software
title SNEToolkit: Spatial named entities disambiguation toolkit
topic Spatial named entity
Disambiguation
Geocoding
Software
url http://www.sciencedirect.com/science/article/pii/S2352711023001760