Unsupervised Spelling Correction for Slovak

This paper introduces a method that automatically proposes and selects a correction for each incorrectly written word in a large Slovak text corpus. The task can be described as finding the sequence of correct words that best matches a list of incorrectly spelled words found in th...

Bibliographic Details
Main Authors: Daniel Hladek, Jan Stas, Jozef Juhar
Format: Article
Language: English
Published: VSB-Technical University of Ostrava 2013-01-01
Series: Advances in Electrical and Electronic Engineering
Subjects: automatic spelling correction, hidden Markov model, natural language processing
Online Access: http://advances.utc.sk/index.php/AEEE/article/view/898
author Daniel Hladek
Jan Stas
Jozef Juhar
collection DOAJ
description This paper introduces a method that automatically proposes and selects a correction for each incorrectly written word in a large Slovak text corpus. The task can be described as finding the sequence of correct words that best matches a list of incorrectly spelled words found in the input. The knowledge base of the classification system (statistics about sequences of correctly typed words and possible corrections for incorrectly typed words) can be described mathematically as a hidden Markov model. The best-matching sequence of correct words is found using the Viterbi algorithm. The system will be evaluated on a manually corrected test set.
first_indexed 2024-04-09T12:42:20Z
format Article
id doaj.art-db6bc109eaeb4ab0a87f2bb6ecd7e7c5
institution Directory Open Access Journal
issn 1336-1376
1804-3119
language English
last_indexed 2024-04-09T12:42:20Z
publishDate 2013-01-01
publisher VSB-Technical University of Ostrava
record_format Article
series Advances in Electrical and Electronic Engineering
spelling doaj.art-db6bc109eaeb4ab0a87f2bb6ecd7e7c5
2023-05-14T20:50:08Z
eng
VSB-Technical University of Ostrava
Advances in Electrical and Electronic Engineering
1336-1376
1804-3119
2013-01-01
11
5
392
397
10.15598/aeee.v11i5.898
617
Unsupervised Spelling Correction for Slovak
Daniel Hladek
Jan Stas
Jozef Juhar
Department of Electronics and Multimedia Communications, Faculty of Electrical Engineering, Technical University of Kosice, Park Komenskeho 13, 042 00 Kosice, Slovak Republic
http://advances.utc.sk/index.php/AEEE/article/view/898
automatic spelling correction
hidden Markov model
natural language processing
title Unsupervised Spelling Correction for Slovak
topic automatic spelling correction
hidden Markov model
natural language processing
url http://advances.utc.sk/index.php/AEEE/article/view/898
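
The description in this record frames correction as decoding a hidden Markov model with the Viterbi algorithm: the correct words are the hidden states and the typed words are the observations. The record gives no probability estimates or candidate-generation details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. The edit-distance error model, the fixed penalty of 2.0 per edit, the probability floor, and the English toy vocabulary and bigram table are all hypothetical and stand in for statistics that would be learned from a Slovak corpus.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def emission_logp(observed, candidate):
    # Assumed error model: every edit costs a fixed log-penalty.
    return -2.0 * edit_distance(observed, candidate)


def transition_logp(prev_word, word, bigrams, floor=1e-4):
    # Bigram log-probability with an assumed floor for unseen word pairs.
    return math.log(bigrams.get((prev_word, word), floor))


def viterbi_correct(observed, vocabulary, bigrams, max_dist=2):
    """Return the most probable sequence of vocabulary words for `observed`."""
    if not observed:
        return []
    # Hidden states at each position: vocabulary words close to the typed word.
    candidates = []
    for obs in observed:
        close = [w for w in vocabulary if edit_distance(obs, w) <= max_dist]
        candidates.append(close or [obs])  # keep the typed word if nothing is close
    # best[i][w] = (log-score of the best path ending in w, previous word on it)
    best = [{w: (emission_logp(observed[0], w), None) for w in candidates[0]}]
    for i in range(1, len(observed)):
        layer = {}
        for w in candidates[i]:
            score, back = max(
                (best[i - 1][p][0] + transition_logp(p, w, bigrams), p)
                for p in candidates[i - 1]
            )
            layer[w] = (score + emission_logp(observed[i], w), back)
        best.append(layer)
    # Backtrack from the best final state.
    word = max(best[-1], key=lambda w: best[-1][w][0])
    path = [word]
    for i in range(len(observed) - 1, 0, -1):
        word = best[i][word][1]
        path.append(word)
    return path[::-1]


# Hypothetical toy data; a real system would estimate these from a corpus.
VOCAB = ["the", "cat", "car", "sat", "on", "mat"]
BIGRAMS = {("the", "cat"): 0.5, ("the", "car"): 0.1, ("cat", "sat"): 0.5}

print(viterbi_correct(["the", "cav", "sat"], VOCAB, BIGRAMS))
```

In this toy run, "cav" is one edit away from both "cat" and "car", so the error model alone cannot separate them; the bigram statistics break the tie, which is the role the word-sequence statistics play in the description above.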