BERTCWS: unsupervised multi-granular Chinese word segmentation based on a BERT method for the geoscience domain

ABSTRACT: Unlike alphabet-based languages such as English, Chinese has no explicit word boundaries. Word segmentation is therefore a fundamental step in Chinese text processing, information retrieval, and knowledge discovery. In the geoscience domain, most...

Bibliographic Details
Main Authors: Qinjun Qiu, Zhong Xie, Kai Ma, Miao Tian
Format: Article
Language: English
Published: Taylor & Francis Group 2023-07-01
Series: Annals of GIS
Online Access: https://www.tandfonline.com/doi/10.1080/19475683.2023.2186487