Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system


Bibliographic Details
Main Authors: Sordo Margarita, Weiss Scott, Goryachev Sergey, Zeng Qing T, Murphy Shawn N, Lazarus Ross
Format: Article
Language: English
Published: BMC 2006-07-01
Series: BMC Medical Informatics and Decision Making
Online Access: http://www.biomedcentral.com/1472-6947/6/30
collection DOAJ
description Abstract
Background: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease.
Methods: The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard.
Results: The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded.
Conclusion: We consider the results promising, given the complexity of the discharge summaries and the extraction tasks.
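The evaluation described in the abstract (comparing extracted labels to a gold standard and excluding cases the gold standard marks "Insufficient Data") can be illustrated with a minimal sketch. This is not the authors' code: the function name, the toy smoking-status labels, and the choice to score by exact label match are all assumptions made for illustration.

```python
from typing import Sequence

# Label used by the gold standard for cases that cannot be scored.
INSUFFICIENT = "Insufficient Data"


def accuracy_excluding_insufficient(
    predicted: Sequence[str], gold: Sequence[str]
) -> float:
    """Fraction of exact label matches, ignoring cases the gold
    standard labels as "Insufficient Data" (hypothetical helper)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")

    # Keep only the cases the gold standard considers scorable.
    scored = [(p, g) for p, g in zip(predicted, gold) if g != INSUFFICIENT]
    if not scored:
        raise ValueError("no scorable cases after exclusion")

    correct = sum(1 for p, g in scored if p == g)
    return correct / len(scored)


# Toy data: the label names below are invented for this example.
gold = [
    "Current Smoker",
    "Never Smoked",
    "Insufficient Data",
    "Past Smoker",
    "Never Smoked",
]
predicted = [
    "Current Smoker",
    "Never Smoked",
    "Never Smoked",
    "Current Smoker",
    "Never Smoked",
]

print(accuracy_excluding_insufficient(predicted, gold))  # 0.75
```

In the toy data, one of five cases is excluded and three of the remaining four match, so the sketch reports 0.75. Excluding unscorable cases in this way shrinks the denominator, which is why the abstract states the exclusion explicitly alongside the reported figures.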
first_indexed 2024-04-13T12:50:27Z
id doaj.art-a9ff2298177240d290173f2eb25c2318
institution Directory Open Access Journal
issn 1472-6947
spelling doaj.art-a9ff2298177240d290173f2eb25c2318; 2022-12-22T02:46:15Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2006-07-01; 6; 1; 30; 10.1186/1472-6947-6-30