How much can part-of-speech tagging help parsing?

Folk wisdom holds that incorporating a part-of-speech tagger into a system that performs deep linguistic analysis will improve the speed and accuracy of the system. Previous studies of tagging have tested this belief by incorporating an existing tagger into a parsing system and observing the effect...

Bibliographic Details
Main Author:	Dalrymple, M
Format:	Journal article
Language:	English
Published:	2006

author	Dalrymple, M
collection	OXFORD
description	Folk wisdom holds that incorporating a part-of-speech tagger into a system that performs deep linguistic analysis will improve the speed and accuracy of the system. Previous studies of tagging have tested this belief by incorporating an existing tagger into a parsing system and observing the effect on the speed of the parser and accuracy of the results. However, not much work has been done to determine in a fine-grained manner exactly how much tagging can help to disambiguate or reduce ambiguity in parser output. We take a new approach to this issue by examining the full parse-forest output of a large-scale LFG-based English grammar (Riezler et al. (2002)) running on the XLE grammar development platform (Maxwell and Kaplan (1993); Maxwell and Kaplan (1996)); and partitioning the parse outputs into equivalence classes based on the tag sequences for each parse. If we find a large number of tag-sequence equivalence classes for each sentence, we can conclude that different parses tend to be distinguished by their tags; a small number means that tagging would not help much in reducing ambiguity. In this way, we can determine how much tagging would help us in the best case, if we had the "perfect tagger" to give us the correct tag sequence for each sentence. We show that if a perfect tagger were available, a reduction in ambiguity of about 50% would be available. Somewhat surprisingly, about 30% of the sentences in the corpus that was examined would not be disambiguated, even by the perfect tagger, since all of the parses for these sentences shared the same tag sequence. Our study also helps to inform research on tagging by providing a targeted determination of exactly which tags can help the most in disambiguation. © 2006 Cambridge University Press.
first_indexed	2024-03-06T18:42:17Z
format	Journal article
id	oxford-uuid:0d4d8763-5392-4a1b-aa3b-9f8c5fdfe285
institution	University of Oxford
language	English
last_indexed	2024-03-06T18:42:17Z
publishDate	2006
record_format	dspace
spelling	oxford-uuid:0d4d8763-5392-4a1b-aa3b-9f8c5fdfe285; 2022-03-26T09:39:49Z; How much can part-of-speech tagging help parsing?; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:0d4d8763-5392-4a1b-aa3b-9f8c5fdfe285; English; Symplectic Elements at Oxford; 2006; Dalrymple, M
title	How much can part-of-speech tagging help parsing?

Similar Items