Classifying cuneiform symbols using machine learning algorithms with unigram features on a balanced dataset
Recognizing written languages using symbols written in cuneiform is a tough endeavor due to the lack of information and the challenge of the process of tokenization. The Cuneiform Language Identification (CLI) dataset attempts to understand seven cuneiform languages and dialects, including Sumerian...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
De Gruyter
2023-09-01
|
Series: | Journal of Intelligent Systems |
Subjects: | |
Online Access: | https://doi.org/10.1515/jisys-2023-0087 |