Language model parameter estimation using user transcriptions
In limited data domains, many effective language modeling techniques construct models with parameters to be estimated on an in-domain development set. However, in some domains, no such data exist beyond the unlabeled test corpus. In this work, we explore the iterative use of the recognition hypothes...
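Main authors: | Hsu, Bo-June; Glass, James R. |
---|---|
Other authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | Institute of Electrical and Electronics Engineers, 2010 |
Subjects: | adaptation; language modeling; speech recognition |
Online access: | http://hdl.handle.net/1721.1/58944 https://orcid.org/0000-0002-3097-360X |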
Field | Value |
---|---|
_version_ | 1826206616257560576 |
author | Hsu, Bo-June; Glass, James R. |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
collection | MIT |
description | In limited data domains, many effective language modeling techniques construct models with parameters to be estimated on an in-domain development set. However, in some domains, no such data exist beyond the unlabeled test corpus. In this work, we explore the iterative use of the recognition hypotheses for unsupervised parameter estimation. We also evaluate the effectiveness of supervised adaptation using varying amounts of user-provided transcripts of utterances selected via multiple strategies. While unsupervised adaptation obtains 80% of the potential error reductions, it is outperformed by using only 300 words of user transcription. By transcribing the lowest confidence utterances first, we further obtain an effective word error rate reduction of 0.6%. |
first_indexed | 2024-09-23T13:35:45Z |
format | Article |
id | mit-1721.1/58944 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:35:45Z |
publishDate | 2010 |
publisher | Institute of Electrical and Electronics Engineers |
record_format | dspace |
spelling | mit-1721.1/58944; 2022-10-01T15:54:45Z; T-Party Project; 2010-10-07T16:43:50Z; 2009-05; Article; http://purl.org/eprint/type/JournalArticle; ISBN 978-1-4244-2353-8; ISSN 1520-6149; INSPEC Accession Number: 10701485; http://hdl.handle.net/1721.1/58944; Bo-June Hsu and J. Glass. “Language model parameter estimation using user transcriptions.” Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on. 2009. 4805-4808. © 2009 IEEE; https://orcid.org/0000-0002-3097-360X; en_US; http://dx.doi.org/10.1109/ICASSP.2009.4960706; Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009; Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.; application/pdf; Institute of Electrical and Electronics Engineers; IEEE |
title | Language model parameter estimation using user transcriptions |
topic | adaptation; language modeling; speech recognition |
url | http://hdl.handle.net/1721.1/58944 https://orcid.org/0000-0002-3097-360X |
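
The description mentions transcribing the lowest-confidence utterances first under a small transcription budget (for example, 300 words). The record does not say how the authors implemented this, so the following is only a minimal sketch of one plausible reading. The names `Utterance`, `select_for_transcription`, `confidence`, and `word_budget` are hypothetical, the hypothesis length is used as a stand-in for the unknown transcript length, and skipping utterances that would exceed the budget is an assumed policy.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    utt_id: str
    hypothesis: str    # recognizer's 1-best word string (assumed available)
    confidence: float  # hypothetical utterance-level confidence score


def select_for_transcription(utterances, word_budget):
    """Pick utterances lowest-confidence first until the word budget is spent.

    Word counts come from the recognition hypothesis, since the true
    transcript length is unknown before the user transcribes it.
    """
    selected = []
    words_used = 0
    for utt in sorted(utterances, key=lambda u: u.confidence):
        n_words = len(utt.hypothesis.split())
        if words_used + n_words > word_budget:
            continue  # assumed policy: skip utterances that would exceed the budget
        selected.append(utt)
        words_used += n_words
    return selected, words_used


# Toy data, invented purely for illustration.
utts = [
    Utterance("u1", "show me flights to boston", 0.92),
    Utterance("u2", "uh what about the the lecture", 0.31),
    Utterance("u3", "next slide please", 0.55),
]
picked, used = select_for_transcription(utts, word_budget=9)
```

With this toy input, `u2` and `u3` are chosen and exactly nine hypothesis words are used. In a supervised adaptation setting, the user-provided transcripts of the selected utterances would then serve as the in-domain data for estimating the language model parameters.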