Incorporating Content Structure into Text Analysis Applications

URL to papers listed on conference site

Bibliographic Details
Main Authors: Sauper, Christina Joan; Haghighi, Aria; Barzilay, Regina
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: Association for Computational Linguistics 2011
Online Access: http://hdl.handle.net/1721.1/62235
https://orcid.org/0000-0002-2921-8201
Summary: Information about the content structure of a document is largely ignored by current text analysis applications such as information extraction and sentiment analysis. This stands in contrast to the linguistic intuition that rich contextual information should benefit such applications. We present a framework which combines a supervised text analysis application with the induction of latent content structure. Both of these elements are learned jointly using the EM algorithm. The induced content structure is learned from a large unannotated corpus and biased by the underlying text analysis task. We demonstrate that exploiting content structure yields significant improvements over approaches that rely only on local context.
Citation: Sauper, Christina, Aria Haghighi, and Regina Barzilay. "Incorporating Content Structure into Text Analysis Applications." EMNLP 2010: Conference on Empirical Methods in Natural Language Processing, October 9-11, 2010, MIT, Massachusetts, USA.
Conference: EMNLP 2010 : Conference on Empirical Methods in Natural Language Processing
Conference Site: http://www.lsi.upc.edu/events/emnlp2010/papers.html
Type: Article; http://purl.org/eprint/type/ConferencePaper
Dates: 2010-10 (conference); 2011-04-19T18:21:46Z (repository record)
File Format: application/pdf
License: Creative Commons Attribution-Noncommercial-Share Alike 3.0, http://creativecommons.org/licenses/by-nc-sa/3.0/
Source: MIT web domain