Automatic Word Spacing of Korean Using Syllable and Morpheme

In Korean, correct spacing is essential to the readability and interpretation of sentences. In natural language processing for Korean, a sentence with incorrect spacing has an altered structure, which degrades performance. In the previous study...


Bibliographic Details
Main Authors: Jeong-Myeong Choi, Jong-Dae Kim, Chan-Young Park, Yu-Seop Kim
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Applied Sciences
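Subjects: spacing correction; syllable embedding; morpheme embedding; convolutional neural network; bidirectional long short-term memory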
Online Access: https://www.mdpi.com/2076-3417/11/2/626
author Jeong-Myeong Choi
Jong-Dae Kim
Chan-Young Park
Yu-Seop Kim
collection DOAJ
description In Korean, correct spacing is essential to the readability and interpretation of sentences. In natural language processing for Korean, a sentence with incorrect spacing has an altered structure, which degrades performance. In previous studies, spacing errors were corrected using n-gram-based statistical methods and morphological analyzers, and many recent studies have used deep learning. In this study, we address the spacing error correction problem using both syllable-level and morpheme-level information. The proposed model combines a convolutional neural network layer, which can learn syllable and morphological pattern information in sentences, with a bidirectional long short-term memory layer, which can learn forward and backward sequence information. To evaluate the proposed model, accuracy was measured at the syllable level, and precision, recall, and F1 score were measured at the word level. The experiments confirmed that performance improved over the previous study.
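The two evaluation levels named in the description can be illustrated with a small sketch. This record does not give the paper's exact metric definitions, so the code below assumes one common formulation: each syllable carries a binary label (1 if a space follows it), syllable-level accuracy is the share of matching labels, and a predicted word counts as correct only when both of its boundaries match the reference. The function names and the example sentences are hypothetical and are not taken from the paper.

```python
def spacing_labels(sentence):
    """Return (syllables, labels); labels[i] is 1 if a space follows syllable i."""
    syllables, labels = [], []
    for ch in sentence.strip():
        if ch == " ":
            if labels:
                labels[-1] = 1  # mark the syllable before the space
        else:
            syllables.append(ch)
            labels.append(0)
    return syllables, labels


def word_spans(labels):
    """Convert labels into a set of (start, end) word spans over syllable indices."""
    spans, start = set(), 0
    for i, label in enumerate(labels):
        if label == 1 or i == len(labels) - 1:
            spans.add((start, i + 1))
            start = i + 1
    return spans


def evaluate(gold_sentence, pred_sentence):
    """Syllable-level accuracy plus word-level precision, recall, and F1 (assumed definitions)."""
    gold_syllables, gold = spacing_labels(gold_sentence)
    pred_syllables, pred = spacing_labels(pred_sentence)
    if not gold or gold_syllables != pred_syllables:
        raise ValueError("sentences must be non-empty and contain the same syllables")
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    gold_words, pred_words = word_spans(gold), word_spans(pred)
    correct = len(gold_words & pred_words)  # words with both boundaries right
    precision = correct / len(pred_words)
    recall = correct / len(gold_words)
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: the reference spacing versus a prediction with one misplaced space.
scores = evaluate("나는 학교에 간다", "나는 학교 에간다")
print(scores)
```

In this example a single misplaced space changes two of the seven syllable labels but breaks two of the three words, which shows why word-level scores are usually much stricter than syllable-level accuracy.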
first_indexed 2024-03-09T05:17:46Z
format Article
id doaj.art-1039903b18f34c5fb9955f94fda4742d
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T05:17:46Z
publishDate 2021-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-1039903b18f34c5fb9955f94fda4742d 2023-12-03T12:43:44Z eng
MDPI AG, Applied Sciences, ISSN 2076-3417, 2021-01-01, vol. 11, no. 2, article 626, doi:10.3390/app11020626
Automatic Word Spacing of Korean Using Syllable and Morpheme
Jeong-Myeong Choi, Jong-Dae Kim, Chan-Young Park, Yu-Seop Kim (all four: Department of Convergence Software, Hallym University, Chuncheon-si, Gangwon-do 24252, Korea)
title Automatic Word Spacing of Korean Using Syllable and Morpheme
topic spacing correction
syllable embedding
morpheme embedding
convolutional neural network
bidirectional long short-term memory
url https://www.mdpi.com/2076-3417/11/2/626