A Levenshtein distance-based method for word segmentation in corpus augmentation of geoscience texts
ABSTRACT: For geoscience text, rich domain corpora have become the basis for improving model performance in word segmentation. However, the lack of annotated domain-specific corpora has become a major obstacle to professional information mining in geoscience fields. In this paper, we...
Main Authors: | Jinqu Zhang, Lang Qian, Shu Wang, Yunqiang Zhu, Zhenji Gao, Hailong Yu, Weirong Li |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2023-04-01 |
Series: | Annals of GIS |
Subjects: | |
Online Access: | https://www.tandfonline.com/doi/10.1080/19475683.2023.2165543 |
Similar Items
- Utilization of Augmented and Virtual Reality in Geoscience
  by: Věroslav HOLUŠA, et al.
  Published: (2022-06-01)
- BERTCWS: unsupervised multi-granular Chinese word segmentation based on a BERT method for the geoscience domain
  by: Qinjun Qiu, et al.
  Published: (2023-07-01)
- A report on gender diversity and equality in the geosciences: an analysis of the Swiss Geoscience Meetings from 2003 to 2019
  by: Francesca Piccoli, et al.
  Published: (2021-01-01)
- What Pattern of Progression in Geoscience Fieldwork can be Recognised by Geoscience Educators?
  by: Chris J.H. King
  Published: (2019-04-01)
- Review of the state of practice in geovisualization in the geosciences
  by: Mia Fitzpatrick, et al.
  Published: (2024-01-01)