Machine Learning and Rule-based Approaches to Assertion Classification

Objectives The authors study two approaches to assertion classification. One of these approaches, Extended NegEx (ENegEx), extends the rule-based NegEx algorithm to cover alter-association assertions; the other, Statistical Assertion Classifier (StAC), presents a machine learning solution to asserti...


Bibliographic Details
Main Authors: Uzuner, Ozlem, Zhang, Xiaoran, Sibanda, Tawanda
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: BMJ Publishing Group 2010
Online Access: http://hdl.handle.net/1721.1/52450
https://orcid.org/0000-0001-8011-9850
_version_ 1826213513262006272
author Uzuner, Ozlem
Zhang, Xiaoran
Sibanda, Tawanda
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Uzuner, Ozlem
Zhang, Xiaoran
Sibanda, Tawanda
author_sort Uzuner, Ozlem
collection MIT
description Objectives: The authors study two approaches to assertion classification. One of these approaches, Extended NegEx (ENegEx), extends the rule-based NegEx algorithm to cover alter-association assertions; the other, Statistical Assertion Classifier (StAC), presents a machine learning solution to assertion classification.
Design: For each mention of each medical problem, both approaches determine whether the problem, as asserted by the context of that mention, is present, absent, or uncertain in the patient, or associated with someone other than the patient. The authors use these two systems to (1) extend negation and uncertainty extraction to recognition of alter-association assertions, (2) determine the contribution of lexical and syntactic context to assertion classification, and (3) test if a machine learning approach to assertion classification can be as generally applicable and useful as its rule-based counterparts.
Measurements: The authors evaluated assertion classification approaches with precision, recall, and F-measure.
Results: The ENegEx algorithm is a general algorithm that can be directly applied to new corpora. Despite being based on machine learning, StAC can also be applied out-of-the-box to new corpora and achieve similar generality.
Conclusion: The StAC models that are developed on discharge summaries can be successfully applied to radiology reports. These models benefit the most from words found in the ±4-word window of the target and can outperform ENegEx.
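As an illustration of two ideas named in the abstract above, the following minimal Python sketch collects the words in a ±4-word window around a problem mention and scores a classifier per assertion class with precision, recall, and F-measure. It is not the authors' StAC or ENegEx implementation: the function names, the `L=`/`R=` feature encoding, the whitespace tokenization, and the use of the balanced F-measure (F1) are all assumptions made for the example.

```python
from collections import Counter


def window_features(tokens, start, end, size=4):
    """Count the words within `size` tokens on each side of tokens[start:end].

    Hypothetical feature encoding: left-context words get an "L=" prefix,
    right-context words an "R=" prefix.
    """
    left = tokens[max(0, start - size):start]
    right = tokens[end:end + size]
    feats = Counter()
    for word in left:
        feats["L=" + word.lower()] += 1
    for word in right:
        feats["R=" + word.lower()] += 1
    return feats


def precision_recall_f(gold, predicted, label):
    """Per-class precision, recall, and balanced F-measure (F1, assumed)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up sentence; the mention "chest pain" spans tokens[4:6].
tokens = "The patient denies any chest pain or shortness of breath".split()
features = window_features(tokens, 4, 6)

# Made-up gold and predicted labels for four mentions.
gold = ["present", "absent", "absent", "uncertain"]
pred = ["present", "absent", "present", "uncertain"]
scores_absent = precision_recall_f(gold, pred, "absent")
```

In this toy sentence the negation cue "denies" falls inside the left window, which is the kind of lexical context the abstract reports as most useful to the StAC models.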
first_indexed 2024-09-23T15:50:26Z
format Article
id mit-1721.1/52450
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T15:50:26Z
publishDate 2010
publisher BMJ Publishing Group
record_format dspace
spelling mit-1721.1/52450 2022-09-29T16:29:22Z
2010-03-09T21:43:46Z 2010-03-09T21:43:46Z 2009 2008-08
Article http://purl.org/eprint/type/JournalArticle
1527-974X
http://hdl.handle.net/1721.1/52450
Uzuner, Özlem, Xiaoran Zhang, and Tawanda Sibanda. “Machine Learning and Rule-based Approaches to Assertion Classification.” Journal of the American Medical Informatics Association 16.1 (2009): 109-115. © 2009, British Medical Journal Publishing Group
18952931
https://orcid.org/0000-0001-8011-9850
en_US
http://dx.doi.org/10.1197/jamia.M2950
Journal of the American Medical Informatics Association
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
application/pdf
BMJ Publishing Group
title Machine Learning and Rule-based Approaches to Assertion Classification
url http://hdl.handle.net/1721.1/52450
https://orcid.org/0000-0001-8011-9850