Development of Query Strategies to Identify a Histologic Lymphoma Subtype in a Large Linked Database System
Background: Large linked databases (LLDB) represent a novel resource for cancer outcomes research. However, accurate means of identifying a patient population of interest within these LLDBs can be challenging. Our research group developed a fully integrated platform that provides a means of combinin...
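Main Authors: | Christopher R. Flowers, Leroy Hill, Ashley Hilliard, Susan G. Moore, Rochelle Victor, Michael Graiser, Michael S. Keehan |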
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing 2007-01-01 |
Series: | Cancer Informatics |
Subjects: | Large linked database; cancer outcomes research; cancer epidemiology; cancer registry |
Online Access: | http://la-press.com/article.php?article_id=236 |
_version_ | 1818271668194246656 |
---|---|
author | Christopher R. Flowers; Leroy Hill; Ashley Hilliard; Susan G. Moore; Rochelle Victor; Michael Graiser; Michael S. Keehan |
author_sort | Christopher R. Flowers |
collection | DOAJ |
description | Background: Large linked databases (LLDB) represent a novel resource for cancer outcomes research. However, accurately identifying a patient population of interest within these LLDBs can be challenging. Our research group developed a fully integrated platform that provides a means of combining independent legacy databases into a single cancer-focused LLDB system. We compared the sensitivity and specificity of several SQL-based query strategies for identifying a histologic lymphoma subtype in this LLDB to determine the most accurate legacy data source for identifying a specific cancer patient population. Methods: Query strategies were developed to identify patients with follicular lymphoma from an LLDB of cancer registry data, electronic medical records (EMR), laboratory, administrative, pharmacy, and other clinical data. Queries were performed using common diagnostic codes (ICD-9), cancer registry histology codes (ICD-O), and text searches of EMRs. We reviewed medical records and pathology reports to confirm each diagnosis and calculated the sensitivity and specificity for each query strategy. Results: Together the queries identified 1538 potential cases of follicular lymphoma. Review of pathology and other medical reports confirmed 415 cases of follicular lymphoma, 300 pathology-verified and 115 verified from other medical reports. The query using ICD-O codes was highly specific (96%). Queries using text strings varied in sensitivity (range 7–92%) and specificity (range 86–99%). Queries using ICD-9 codes were both less sensitive (34–44%) and less specific (35–87%). Conclusions: Queries of linked-cancer databases that include cancer registry data should utilize ICD-O codes or employ structured free-text searches to identify patient populations with a precise histologic diagnosis. Abbreviations: LLDB: Large Linked Database; SEER: Surveillance Epidemiology and End Results; EMR: Electronic Medical Record; ICD-9: International Classification of Diseases (9th revision); ICD-O: International Classification of Diseases for Oncology; AP: Anatomical Pathology; WHO: World Health Organization. |
first_indexed | 2024-12-12T21:29:50Z |
format | Article |
id | doaj.art-b9648148f7774049b5b88c6e73fa6f25 |
institution | Directory Open Access Journal |
issn | 1176-9351 |
language | English |
last_indexed | 2024-12-12T21:29:50Z |
publishDate | 2007-01-01 |
publisher | SAGE Publishing |
record_format | Article |
series | Cancer Informatics |
spelling | doaj.art-b9648148f7774049b5b88c6e73fa6f25; 2022-12-22T00:11:21Z; eng; SAGE Publishing; Cancer Informatics; 1176-9351; 2007-01-01; 3; 149; 158; Development of Query Strategies to Identify a Histologic Lymphoma Subtype in a Large Linked Database System; Christopher R. Flowers; Leroy Hill; Ashley Hilliard; Susan G. Moore; Rochelle Victor; Michael Graiser; Michael S. Keehan; http://la-press.com/article.php?article_id=236; Large linked database; cancer outcomes research; cancer epidemiology; cancer registry |
title | Development of Query Strategies to Identify a Histologic Lymphoma Subtype in a Large Linked Database System |
topic | Large linked database; cancer outcomes research; cancer epidemiology; cancer registry |
url | http://la-press.com/article.php?article_id=236 |
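
The description above reports sensitivity and specificity for each query strategy, measured against diagnoses confirmed by record review. A minimal Python sketch of that calculation follows. It assumes each query returns a set of patient identifiers and that the reviewed population and the confirmed cases are available as sets. The function name, the identifiers, and the toy data are hypothetical illustrations, not taken from the article or its database.

```python
from typing import Set, Tuple


def sensitivity_specificity(
    flagged: Set[str], confirmed: Set[str], reviewed: Set[str]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one query strategy.

    flagged   -- patient IDs returned by the query (hypothetical input)
    confirmed -- patient IDs with a diagnosis confirmed on review
    reviewed  -- all patient IDs whose records were reviewed
    """
    # Restrict everything to the reviewed population.
    flagged = flagged & reviewed
    confirmed = confirmed & reviewed
    negatives = reviewed - confirmed

    tp = len(flagged & confirmed)   # confirmed cases the query found
    fn = len(confirmed - flagged)   # confirmed cases the query missed
    fp = len(flagged & negatives)   # non-cases the query flagged
    tn = len(negatives - flagged)   # non-cases the query left alone

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: 10 reviewed patients, 4 confirmed cases.
reviewed = {f"P{i}" for i in range(10)}
confirmed = {"P0", "P1", "P2", "P3"}
flagged = {"P0", "P1", "P2", "P4"}

sens, spec = sensitivity_specificity(flagged, confirmed, reviewed)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Running the same function once per strategy (for example, one flagged set per ICD-9, ICD-O, or text-search query) would give a comparison of the kind the description summarizes.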