Development of Query Strategies to Identify a Histologic Lymphoma Subtype in a Large Linked Database System

Background: Large linked databases (LLDB) represent a novel resource for cancer outcomes research. However, accurate means of identifying a patient population of interest within these LLDBs can be challenging. Our research group developed a fully integrated platform that provides a means of combinin...

Bibliographic Details
Main Authors: Christopher R. Flowers, Leroy Hill, Ashley Hilliard, Susan G. Moore, Rochelle Victor, Michael Graiser, Michael S. Keehan
Format: Article
Language: English
Published: SAGE Publishing 2007-01-01
Series: Cancer Informatics
Subjects: Large linked database; cancer outcomes research; cancer epidemiology; cancer registry
Online Access: http://la-press.com/article.php?article_id=236
author Christopher R. Flowers
Leroy Hill
Ashley Hilliard
Susan G. Moore
Rochelle Victor
Michael Graiser
Michael S. Keehan
collection DOAJ
description Background: Large linked databases (LLDBs) represent a novel resource for cancer outcomes research. However, accurately identifying a patient population of interest within these LLDBs can be challenging. Our research group developed a fully integrated platform that provides a means of combining independent legacy databases into a single cancer-focused LLDB system. We compared the sensitivity and specificity of several SQL-based query strategies for identifying a histologic lymphoma subtype in this LLDB to determine the most accurate legacy data source for identifying a specific cancer patient population.
Methods: Query strategies were developed to identify patients with follicular lymphoma from an LLDB of cancer registry data, electronic medical records (EMR), laboratory, administrative, pharmacy, and other clinical data. Queries were performed using common diagnostic codes (ICD-9), cancer registry histology codes (ICD-O), and text searches of EMRs. We reviewed medical records and pathology reports to confirm each diagnosis and calculated the sensitivity and specificity for each query strategy.
Results: Together the queries identified 1538 potential cases of follicular lymphoma. Review of pathology and other medical reports confirmed 415 cases of follicular lymphoma: 300 pathology-verified and 115 verified from other medical reports. The query using ICD-O codes was highly specific (96%). Queries using text strings varied in sensitivity (range 7–92%) and specificity (range 86–99%). Queries using ICD-9 codes were both less sensitive (34–44%) and less specific (35–87%).
Conclusions: Queries of linked-cancer databases that include cancer registry data should utilize ICD-O codes or employ structured free-text searches to identify patient populations with a precise histologic diagnosis.
Abbreviations: LLDB: Large Linked Database; SEER: Surveillance Epidemiology and End Results; EMR: Electronic Medical Record; ICD-9: International Classification of Diseases (9th revision); ICD-O: International Classification of Diseases for Oncology; AP: Anatomical Pathology; WHO: World Health Organization.
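The comparison described in the Methods above (code-based and text-based queries, each scored against chart-verified diagnoses) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the table and column names, the six toy patients, and the code lists (ICD-O histology codes 9690, 9691, 9695, 9698 and the ICD-9 prefix 202.0) are not taken from the article, whose actual schema and query text are not given in this record.

```python
import sqlite3

# Toy in-memory LLDB. All table names, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registry (patient_id INTEGER, icdo_histology TEXT);
CREATE TABLE billing  (patient_id INTEGER, icd9_code TEXT);
CREATE TABLE emr_note (patient_id INTEGER, note_text TEXT);
-- Reference standard from record review: 1 = follicular lymphoma confirmed.
CREATE TABLE verified (patient_id INTEGER PRIMARY KEY, has_fl INTEGER);
""")
conn.executemany("INSERT INTO registry VALUES (?, ?)",
                 [(1, "9690"), (2, "9680"), (3, "9695")])
conn.executemany("INSERT INTO billing VALUES (?, ?)",
                 [(1, "202.00"), (2, "202.80"), (5, "202.00")])
conn.executemany("INSERT INTO emr_note VALUES (?, ?)",
                 [(1, "Biopsy: follicular lymphoma, grade 1"),
                  (2, "Diffuse large B-cell lymphoma"),
                  (3, "Follicular lymphoma, grade 2"),
                  (4, "Follicular lymphoma confirmed on review"),
                  (5, "No evidence of follicular lymphoma"),
                  (6, "Reactive follicular hyperplasia")])
conn.executemany("INSERT INTO verified VALUES (?, ?)",
                 [(1, 1), (2, 0), (3, 1), (4, 1), (5, 0), (6, 0)])

# One illustrative query per strategy; the code lists are assumptions.
STRATEGIES = {
    "icdo": "SELECT DISTINCT patient_id FROM registry "
            "WHERE icdo_histology IN ('9690', '9691', '9695', '9698')",
    "icd9": "SELECT DISTINCT patient_id FROM billing "
            "WHERE icd9_code LIKE '202.0%'",
    "text": "SELECT DISTINCT patient_id FROM emr_note "
            "WHERE LOWER(note_text) LIKE '%follicular lymphoma%'",
}

def evaluate(sql):
    """Return (sensitivity, specificity) of a query against the verified table."""
    flagged = {row[0] for row in conn.execute(sql)}
    truth = dict(conn.execute("SELECT patient_id, has_fl FROM verified"))
    tp = sum(1 for p, fl in truth.items() if fl and p in flagged)
    fn = sum(1 for p, fl in truth.items() if fl and p not in flagged)
    fp = sum(1 for p, fl in truth.items() if not fl and p in flagged)
    tn = sum(1 for p, fl in truth.items() if not fl and p not in flagged)
    return tp / (tp + fn), tn / (tn + fp)

for name, sql in STRATEGIES.items():
    sens, spec = evaluate(sql)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy data the plain substring search also flags the negated note ("No evidence of follicular lymphoma"), which is one way a free-text query can lose specificity and may be part of why the conclusions recommend structured rather than ad hoc text searches.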
first_indexed 2024-12-12T21:29:50Z
format Article
id doaj.art-b9648148f7774049b5b88c6e73fa6f25
institution Directory Open Access Journal
issn 1176-9351
language English
last_indexed 2024-12-12T21:29:50Z
publishDate 2007-01-01
publisher SAGE Publishing
record_format Article
series Cancer Informatics
spelling doaj.art-b9648148f7774049b5b88c6e73fa6f25; 2022-12-22T00:11:21Z; eng; SAGE Publishing; Cancer Informatics; 1176-9351; 2007-01-01; vol. 3, pp. 149–158; http://la-press.com/article.php?article_id=236
title Development of Query Strategies to Identify a Histologic Lymphoma Subtype in a Large Linked Database System
topic Large linked database
cancer outcomes research
cancer epidemiology
cancer registry
url http://la-press.com/article.php?article_id=236