Improved morphological decomposition for Arabic broadcast news transcription

Bibliographic Details
Main Authors: Ng, Tim, Nguyen, Kham, Zbib, Rabih M., Nguyen, Long
Other Authors: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Format: Article
Language: en_US
Published: Institute of Electrical and Electronics Engineers 2010
Online Access: http://hdl.handle.net/1721.1/60032
Description
Summary: In this paper, we report progress in Arabic speech recognition obtained by incorporating contextual information into the process of morphological decomposition. The new approach achieves lower out-of-vocabulary and word error rates than our previous work, in which the morphological decomposition relied on word-level information only. We also describe how the vocalization procedure was improved to produce pronunciations for some dialectal Arabic words. With the new approach, we reduced the word error rate by 0.8% absolute (4.7% relative) compared to the baseline approach.
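
The absolute and relative figures quoted in the summary are linked by simple arithmetic: the relative reduction is the absolute reduction divided by the baseline word error rate. The sketch below illustrates that relationship only. The function names are hypothetical. The baseline value it computes is an inference from the two rounded figures, not a number reported in the summary, so it should be read as approximate.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


def implied_baseline(absolute_pct: float, relative_pct: float) -> float:
    """Baseline WER (in percent) implied by an absolute and a relative reduction.

    Both arguments are percentages, e.g. 0.8 and 4.7.
    """
    return absolute_pct / (relative_pct / 100.0)


# Figures taken from the summary: 0.8% absolute, 4.7% relative.
# Assumption: the baseline below is derived from these rounded values,
# so it is only an estimate (roughly 17%).
baseline = implied_baseline(0.8, 4.7)
new_wer = baseline - 0.8

print(f"implied baseline WER: {baseline:.1f}%")
print(f"implied new WER:      {new_wer:.1f}%")
print(f"relative reduction:   {100 * relative_reduction(baseline, new_wer):.1f}%")
```

Because both reported figures are rounded to one decimal place, the true baseline could plausibly lie about a percentage point on either side of this estimate.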