A tree-to-tree model for statistical machine translation

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.

Bibliographic Details
Main Author: Cowan, Brooke A. (Brooke Alissa), 1972-
Other Authors: Michael J. Collins.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2009
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/44689
Abstract: In this thesis, we take a statistical tree-to-tree approach to solving the problem of machine translation (MT). In a statistical tree-to-tree approach, first the source-language input is parsed into a syntactic tree structure; then the source-language tree is mapped to a target-language tree. This kind of approach has several advantages. For one, parsing the input generates valuable information about its meaning. In addition, the mapping from a source-language tree to a target-language tree offers a mechanism for preserving the meaning of the input. Finally, producing a target-language tree helps to ensure the grammaticality of the output. A main focus of this thesis is to develop a statistical tree-to-tree mapping algorithm. Our solution involves a novel representation called an aligned extended projection, or AEP. The AEP, inspired by ideas in linguistic theory related to tree-adjoining grammars, is a parse-tree-like structure that models clause-level phenomena such as verbal argument structure and lexical word order. The AEP also contains alignment information that links the source-language input to the target-language output. Instead of learning a mapping from a source-language tree to a target-language tree, the AEP-based approach learns a mapping from a source-language tree to a target-language AEP.

The AEP is a complex structure, and learning a mapping from parse trees to AEPs presents a challenging machine learning problem. In this thesis, we use a linear structured prediction model to solve this learning problem. A human evaluation of the AEP-based translation approach in a German-to-English task shows significant improvements in the grammaticality of translations. This thesis also presents a statistical parser for Spanish that could be used as part of a Spanish/English translation system.

Includes bibliographical references (p. 227-234). 234 p. application/pdf.

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582