Deep learning for reading and understanding language

This thesis presents novel tasks and deep learning methods for machine reading comprehension and question answering, with the goal of achieving natural language understanding.

Full description

Bibliographic Details
Main Author: Kočiský, T
Other Authors: Blunsom, P
Format: Thesis
Published: 2017
Subjects: Computer Science; Computational Linguistics; Natural Language Processing
This thesis presents novel tasks and deep learning methods for machine reading comprehension and question answering, with the goal of achieving natural language understanding.

First, we consider a semantic parsing task in which the model understands sentences and translates them into a logical form or instructions. We present a novel semi-supervised sequential autoencoder that treats language as a discrete sequential latent variable and semantic parses as the observations. This model allows us to leverage synthetically generated unpaired logical forms, and thereby alleviates the lack of supervised training data. We show that the semi-supervised model outperforms a supervised model when trained with the additional generated data.

Second, reading comprehension requires integrating information and reasoning about events, entities, and their relations across a full document. Question answering is conventionally used to assess reading comprehension ability, in both artificial agents and children learning to read. We propose a new, challenging, supervised reading comprehension task. We gather a large-scale dataset of news stories from the CNN and Daily Mail websites, with Cloze-style questions created from the story highlights. This dataset makes it possible, for the first time, to train deep learning models for reading comprehension. We also introduce novel attention-based models for this task and present a qualitative analysis of the attention mechanism.

Finally, following recent advances in reading comprehension in both models and task design, we propose a new task for understanding complex narratives, NarrativeQA, consisting of the full texts of books and movie scripts. We collect human-written questions and answers based on high-level plot summaries. The task is designed to encourage the development of models for language understanding: it is constructed so that successfully answering its questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience. We show that although humans solve the task easily, standard reading comprehension models struggle on it.
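As a rough illustration of the Cloze-style construction described in the abstract, the sketch below blanks out entity mentions in a news highlight to produce (question, answer) pairs. This is a simplified assumption for illustration only: the function name `make_cloze`, the `@placeholder` marker, and the plain-text entity matching are choices made here, not the thesis's actual pipeline (which, in the CNN/Daily Mail setting, also anonymizes entities in the supporting document).

```python
def make_cloze(highlight: str, entities: list[str]) -> list[tuple[str, str]]:
    """Turn a news highlight into Cloze-style (question, answer) pairs
    by replacing each entity mention with a placeholder token."""
    pairs = []
    for entity in entities:
        if entity in highlight:
            question = highlight.replace(entity, "@placeholder")
            pairs.append((question, entity))
    return pairs

if __name__ == "__main__":
    highlight = "Obama meets Merkel in Berlin to discuss trade"
    for question, answer in make_cloze(highlight, ["Obama", "Merkel", "Berlin"]):
        print(question, "->", answer)
```

A model is then trained to read the full news story and predict the answer entity that fills the `@placeholder` slot in the question.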
Institution: University of Oxford
Record ID: oxford-uuid:cc45e366-cdd8-495b-af42-dfd726700ff0