Supertagging the Long Tail with Tree-Structured Decoding of Complex Categories

Abstract: Although current CCG supertaggers achieve high accuracy on the standard WSJ test set, few systems make use of the categories’ internal structure that will drive the syntactic derivation during parsing. The tagset is traditionally truncated, discarding the many rare and comple...

Bibliographic Details
Main Authors: Jakob Prange, Nathan Schneider, Vivek Srikumar
Format: Article
Language: English
Published: The MIT Press 2021-01-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00364/98238/Supertagging-the-Long-Tail-with-Tree-Structured
author Jakob Prange
Nathan Schneider
Vivek Srikumar
collection DOAJ
description Abstract: Although current CCG supertaggers achieve high accuracy on the standard WSJ test set, few systems make use of the categories’ internal structure that will drive the syntactic derivation during parsing. The tagset is traditionally truncated, discarding the many rare and complex category types in the long tail. However, supertags are themselves trees. Rather than give up on rare tags, we investigate constructive models that account for their internal structure, including novel methods for tree-structured prediction. Our best tagger is capable of recovering a sizeable fraction of the long-tail supertags and even generates CCG categories that have never been seen in training, while approximating the prior state of the art in overall tag accuracy with fewer parameters. We further investigate how well different approaches generalize to out-of-domain evaluation sets.
first_indexed 2024-12-12T16:11:07Z
format Article
id doaj.art-334b9c621d174fe288dc15ab207d9bf3
institution Directory Open Access Journal
issn 2307-387X
language English
last_indexed 2024-12-12T16:11:07Z
publishDate 2021-01-01
publisher The MIT Press
record_format Article
series Transactions of the Association for Computational Linguistics
spelling doaj.art-334b9c621d174fe288dc15ab207d9bf3 | 2022-12-22T00:19:11Z | eng | The MIT Press | Transactions of the Association for Computational Linguistics | 2307-387X | 2021-01-01 | 9 | 243 | 260 | 10.1162/tacl_a_00364 | Supertagging the Long Tail with Tree-Structured Decoding of Complex Categories | Jakob Prange (0), Nathan Schneider (1), Vivek Srikumar (2) | (0) Georgetown University, United States. jp1724@georgetown.edu | (1) Georgetown University, United States. nathan.schneider@georgetown.edu | (2) University of Utah, United States. svivek@cs.utah.edu | Abstract: Although current CCG supertaggers achieve high accuracy on the standard WSJ test set, few systems make use of the categories’ internal structure that will drive the syntactic derivation during parsing. The tagset is traditionally truncated, discarding the many rare and complex category types in the long tail. However, supertags are themselves trees. Rather than give up on rare tags, we investigate constructive models that account for their internal structure, including novel methods for tree-structured prediction. Our best tagger is capable of recovering a sizeable fraction of the long-tail supertags and even generates CCG categories that have never been seen in training, while approximating the prior state of the art in overall tag accuracy with fewer parameters. We further investigate how well different approaches generalize to out-of-domain evaluation sets. | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00364/98238/Supertagging-the-Long-Tail-with-Tree-Structured
title Supertagging the Long Tail with Tree-Structured Decoding of Complex Categories
url https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00364/98238/Supertagging-the-Long-Tail-with-Tree-Structured
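
Illustration (not part of the catalogued article): the abstract observes that supertags are themselves trees. The Python sketch below shows one way to read a CCG category string such as (S\NP)/NP as a binary tree. It is a minimal sketch under stated assumptions, not the authors' method: the names Atom, Functor, parse_category, and count_nodes are hypothetical, and unparenthesized slashes are assumed to associate to the left.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """Leaf of the category tree, e.g. S, NP, or S[dcl]."""
    name: str


@dataclass(frozen=True)
class Functor:
    """Internal node: result, slash direction, argument."""
    result: "Category"
    slash: str  # "/" (argument to the right) or "\\" (argument to the left)
    argument: "Category"


Category = Union[Atom, Functor]


def parse_category(text: str) -> Category:
    """Parse a CCG category string into a tree (hypothetical helper).

    Assumption: slashes without parentheses associate to the left,
    so S\\NP/NP is read as (S\\NP)/NP.
    """
    pos = 0

    def parse_operand() -> Category:
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            inner = parse_expr()
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"missing ')' in {text!r}")
            pos += 1  # consume ")"
            return inner
        start = pos
        while pos < len(text) and text[pos] not in "()/\\":
            pos += 1
        if start == pos:
            raise ValueError(f"expected a category at position {pos} in {text!r}")
        return Atom(text[start:pos])

    def parse_expr() -> Category:
        nonlocal pos
        left = parse_operand()
        while pos < len(text) and text[pos] in "/\\":
            slash = text[pos]
            pos += 1
            left = Functor(left, slash, parse_operand())
        return left

    tree = parse_expr()
    if pos != len(text):
        raise ValueError(f"unexpected trailing input in {text!r}")
    return tree


def count_nodes(cat: Category) -> int:
    """Number of nodes (atoms plus slashes) in the category tree."""
    if isinstance(cat, Atom):
        return 1
    return 1 + count_nodes(cat.result) + count_nodes(cat.argument)


if __name__ == "__main__":
    transitive_verb = parse_category("(S\\NP)/NP")
    print(transitive_verb)
    print(count_nodes(transitive_verb))
```

Under this reading, a tagger that predicts a category node by node can in principle assemble a tree that never occurred as a whole tag in training, which is the kind of generalization the abstract describes.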