Automatic Word Spacing of Korean Using Syllable and Morpheme
In Korean, correct word spacing is important for the readability of a sentence and for understanding its context. In natural language processing for Korean, a sentence with incorrect spacing changes the sentence structure, which degrades performance. In the previous study...
Main Authors: | Jeong-Myeong Choi, Jong-Dae Kim, Chan-Young Park, Yu-Seop Kim |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-01-01 |
Series: | Applied Sciences |
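Subjects: | spacing correction; syllable embedding; morpheme embedding; convolutional neural network; bidirectional long short-term memory |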
Online Access: | https://www.mdpi.com/2076-3417/11/2/626 |
_version_ | 1827602395383726080 |
---|---|
author | Jeong-Myeong Choi; Jong-Dae Kim; Chan-Young Park; Yu-Seop Kim |
author_facet | Jeong-Myeong Choi; Jong-Dae Kim; Chan-Young Park; Yu-Seop Kim |
author_sort | Jeong-Myeong Choi |
collection | DOAJ |
description | In Korean, correct word spacing is important for the readability of a sentence and for understanding its context. In natural language processing for Korean, a sentence with incorrect spacing changes the sentence structure, which degrades performance. Previous studies corrected spacing errors with n-gram-based statistical methods and morphological analyzers, and many recent studies use deep learning. In this study, we address spacing error correction using both syllable-level and morpheme-level information. The proposed model combines a convolutional neural network layer, which learns syllable and morphological pattern information in sentences, with a bidirectional long short-term memory layer, which learns forward and backward sequence information. The model was evaluated by accuracy at the syllable level and by precision, recall, and F1 score at the word level. The experiments confirmed that performance improved over the previous study. |
first_indexed | 2024-03-09T05:17:46Z |
format | Article |
id | doaj.art-1039903b18f34c5fb9955f94fda4742d |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T05:17:46Z |
publishDate | 2021-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-1039903b18f34c5fb9955f94fda4742d; 2023-12-03T12:43:44Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2021-01-01; 11; 2; 626; 10.3390/app11020626; Automatic Word Spacing of Korean Using Syllable and Morpheme; Jeong-Myeong Choi [0]; Jong-Dae Kim [1]; Chan-Young Park [2]; Yu-Seop Kim [3]; [0-3] Department of Convergence Software, Hallym University, Chuncheon-si, Gangwon-do 24252, Korea; In Korean, spacing is very important to understand the readability and context of sentences. In addition, in the case of natural language processing for Korean, if a sentence with an incorrect spacing is used, the structure of the sentence is changed, which affects performance. In the previous study, spacing errors were corrected using n-gram based statistical methods and morphological analyzers, and recently many studies using deep learning have been conducted. In this study, we try to solve the spacing error correction problem using both the syllable-level and morpheme-level. The proposed model uses a structure that combines the convolutional neural network layer that can learn syllable and morphological pattern information in sentences and the bidirectional long short-term memory layer that can learn forward and backward sequence information. When evaluating the performance of the proposed model, the accuracy was evaluated at the syllable-level, and also precision, recall, and f1 score were evaluated at the word-level. As a result of the experiment, it was confirmed that performance was improved from the previous study.; https://www.mdpi.com/2076-3417/11/2/626; spacing correction; syllable embedding; morpheme embedding; convolutional neural network; bidirectional long short-term memory |
spellingShingle | Jeong-Myeong Choi; Jong-Dae Kim; Chan-Young Park; Yu-Seop Kim; Automatic Word Spacing of Korean Using Syllable and Morpheme; Applied Sciences; spacing correction; syllable embedding; morpheme embedding; convolutional neural network; bidirectional long short-term memory |
title | Automatic Word Spacing of Korean Using Syllable and Morpheme |
title_full | Automatic Word Spacing of Korean Using Syllable and Morpheme |
title_fullStr | Automatic Word Spacing of Korean Using Syllable and Morpheme |
title_full_unstemmed | Automatic Word Spacing of Korean Using Syllable and Morpheme |
title_short | Automatic Word Spacing of Korean Using Syllable and Morpheme |
title_sort | automatic word spacing of korean using syllable and morpheme |
topic | spacing correction; syllable embedding; morpheme embedding; convolutional neural network; bidirectional long short-term memory |
url | https://www.mdpi.com/2076-3417/11/2/626 |
work_keys_str_mv | AT jeongmyeongchoi automaticwordspacingofkoreanusingsyllableandmorpheme AT jongdaekim automaticwordspacingofkoreanusingsyllableandmorpheme AT chanyoungpark automaticwordspacingofkoreanusingsyllableandmorpheme AT yuseopkim automaticwordspacingofkoreanusingsyllableandmorpheme |
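
The description above reports accuracy at the syllable level and precision, recall, and F1 score at the word level. The following is a minimal sketch of how such metrics could be computed for a spacing task. It is not the authors' evaluation code: the binary "space follows this syllable" labeling, the exact-span definition of a correct word, and the function names `spacing_labels`, `word_spans`, and `evaluate` are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def spacing_labels(sentence: str) -> Tuple[str, List[int]]:
    """Strip spaces and return (text, labels).

    labels[i] is 1 if a space follows the i-th syllable, else 0.
    (Assumed labeling scheme; the paper may use a different one.)
    """
    chars: List[str] = []
    labels: List[int] = []
    for ch in sentence.strip():
        if ch == " ":
            if labels:
                labels[-1] = 1  # mark the syllable before the space
        else:
            chars.append(ch)
            labels.append(0)
    return "".join(chars), labels


def word_spans(labels: List[int]) -> Set[Tuple[int, int]]:
    """Convert spacing labels into a set of (start, end) word spans."""
    spans: Set[Tuple[int, int]] = set()
    start = 0
    for i, tag in enumerate(labels):
        # A word ends where a space follows, or at the last syllable.
        if tag == 1 or i == len(labels) - 1:
            spans.add((start, i + 1))
            start = i + 1
    return spans


def evaluate(gold: List[int], pred: List[int]) -> Dict[str, float]:
    """Syllable-level accuracy plus word-level precision, recall, and F1."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equal in length")
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    gold_words, pred_words = word_spans(gold), word_spans(pred)
    # A predicted word counts as correct only if both boundaries match.
    true_positives = len(gold_words & pred_words)
    precision = true_positives / len(pred_words)
    recall = true_positives / len(gold_words)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    _, gold = spacing_labels("나는 학교에 간다")   # reference spacing
    _, pred = spacing_labels("나는 학교 에간다")   # hypothetical model output
    print(evaluate(gold, pred))
```

In this made-up example, two of the seven syllable labels differ and only one of the three words has both boundaries right, which shows why word-level scores can be much lower than syllable-level accuracy for the same prediction.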