Improved morphological decomposition for Arabic broadcast news transcription

In this paper, we report progress in Arabic speech recognition obtained by incorporating contextual information into morphological decomposition. The new approach achieves lower out-of-vocabulary and word error rates than our previous work, in which the morphological decompositi...

Bibliographic Details
Main Authors: Ng, Tim, Nguyen, Kham, Zbib, Rabih M., Nguyen, Long
Other Authors: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Format: Article
Language: en_US
Published: Institute of Electrical and Electronics Engineers 2010
Online Access: http://hdl.handle.net/1721.1/60032
author Ng, Tim
Nguyen, Kham
Zbib, Rabih M.
Nguyen, Long
author2 Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
collection MIT
description In this paper, we report progress in Arabic speech recognition obtained by incorporating contextual information into morphological decomposition. The new approach achieves lower out-of-vocabulary and word error rates than our previous work, in which morphological decomposition relied on word-level information only. We also describe how the vocalization procedure was improved to produce pronunciations for some dialectal Arabic words. With the new approach, the word error rate is reduced by 0.8% absolute (4.7% relative) compared with the baseline approach.
first_indexed 2024-09-23T15:42:21Z
format Article
id mit-1721.1/60032
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T15:42:21Z
publishDate 2010
publisher Institute of Electrical and Electronics Engineers
record_format dspace
spelling mit-1721.1/60032
2024-06-28T12:39:36Z
Improved morphological decomposition for Arabic broadcast news transcription
Ng, Tim
Nguyen, Kham
Zbib, Rabih M.
Nguyen, Long
Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Zbib, Rabih M.
Zbib, Rabih M.
In this paper, we report progress in Arabic speech recognition obtained by incorporating contextual information into morphological decomposition. The new approach achieves lower out-of-vocabulary and word error rates than our previous work, in which morphological decomposition relied on word-level information only. We also describe how the vocalization procedure was improved to produce pronunciations for some dialectal Arabic words. With the new approach, the word error rate is reduced by 0.8% absolute (4.7% relative) compared with the baseline approach.
United States. Defense Advanced Research Projects Agency (DARPA). GALE program (Contract No. HR0011-06-C-0022)
2010-11-23T19:07:20Z
2010-11-23T19:07:20Z
2009-05
2009-04
Article
http://purl.org/eprint/type/ConferencePaper
978-1-4244-2353-8
1520-6149
INSPEC Accession Number: 10700940
http://hdl.handle.net/1721.1/60032
Tim Ng et al. “Improved morphological decomposition for Arabic broadcast news transcription.” Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on. 2009. 4309-4312. © Copyright 2009 IEEE
en_US
http://dx.doi.org/10.1109/ICASSP.2009.4960582
IEEE International Conference on Acoustics, Speech and Signal Processing, 2009. ICASSP 2009.
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
application/pdf
Institute of Electrical and Electronics Engineers
IEEE
title Improved morphological decomposition for Arabic broadcast news transcription
url http://hdl.handle.net/1721.1/60032