Improved morphological decomposition for Arabic broadcast news transcription
In this paper, we show the progress for Arabic speech recognition by incorporating contextual information into the process of morphological decomposition. The new approach achieves lower out-of-vocabulary and word error rates when compared to our previous work, in which the morphological decompositi...
Main Authors: | Ng, Tim; Nguyen, Kham; Zbib, Rabih M.; Nguyen, Long |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering |
Format: | Article |
Language: | en_US |
Published: | Institute of Electrical and Electronics Engineers, 2010 |
Online Access: | http://hdl.handle.net/1721.1/60032 |
_version_ | 1811093256744730624 |
---|---|
author | Ng, Tim; Nguyen, Kham; Zbib, Rabih M.; Nguyen, Long |
author2 | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering |
author_facet | Massachusetts Institute of Technology. Department of Civil and Environmental Engineering; Ng, Tim; Nguyen, Kham; Zbib, Rabih M.; Nguyen, Long |
author_sort | Ng, Tim |
collection | MIT |
description | In this paper, we show the progress for Arabic speech recognition by incorporating contextual information into the process of morphological decomposition. The new approach achieves lower out-of-vocabulary and word error rates when compared to our previous work, in which the morphological decomposition relies on word-level information only. We also describe how the vocalization procedure is improved to produce pronunciations for some dialect Arabic words. By using the new approach, we reduced the word error by 0.8% absolute (4.7% relative) when compared to the baseline approach. |
first_indexed | 2024-09-23T15:42:21Z |
format | Article |
id | mit-1721.1/60032 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T15:42:21Z |
publishDate | 2010 |
publisher | Institute of Electrical and Electronics Engineers |
record_format | dspace |
spelling | mit-1721.1/60032; 2024-06-28T12:39:36Z; Improved morphological decomposition for Arabic broadcast news transcription; Ng, Tim; Nguyen, Kham; Zbib, Rabih M.; Nguyen, Long; Massachusetts Institute of Technology. Department of Civil and Environmental Engineering; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Zbib, Rabih M.; Zbib, Rabih M.; In this paper, we show the progress for Arabic speech recognition by incorporating contextual information into the process of morphological decomposition. The new approach achieves lower out-of-vocabulary and word error rates when compared to our previous work, in which the morphological decomposition relies on word-level information only. We also describe how the vocalization procedure is improved to produce pronunciations for some dialect Arabic words. By using the new approach, we reduced the word error by 0.8% absolute (4.7% relative) when compared to the baseline approach.; United States. Defense Advanced Research Projects Agency (DARPA). GALE program (Contract No. HR0011-06-C-0022); 2010-11-23T19:07:20Z; 2010-11-23T19:07:20Z; 2009-05; 2009-04; Article; http://purl.org/eprint/type/ConferencePaper; 978-1-4244-2353-8; 1520-6149; INSPEC Accession Number: 10700940; http://hdl.handle.net/1721.1/60032; Tim Ng et al. “Improved morphological decomposition for Arabic broadcast news transcription.” Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on. 2009. 4309-4312. © Copyright 2009 IEEE; en_US; http://dx.doi.org/10.1109/ICASSP.2009.4960582; IEEE International Conference on Acoustics, Speech and Signal Processing, 2009. ICASSP 2009.; Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.; application/pdf; Institute of Electrical and Electronics Engineers; IEEE |
title | Improved morphological decomposition for Arabic broadcast news transcription |
url | http://hdl.handle.net/1721.1/60032 |