BERTCWS: unsupervised multi-granular Chinese word segmentation based on a BERT method for the geoscience domain
ABSTRACT: Unlike alphabet-based languages such as English, Chinese has no explicit word boundaries. Word segmentation is therefore a fundamental step in Chinese text processing, information retrieval, and knowledge discovery. In the geoscience domain, most...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2023-07-01 |
Series: | Annals of GIS |
Subjects: | |
Online Access: | https://www.tandfonline.com/doi/10.1080/19475683.2023.2186487 |