Generating and evaluating a propensity model using textual features from electronic medical records.

BACKGROUND: Propensity score (PS) methods are commonly used to control for confounding in comparative effectiveness studies. Electronic health records (EHRs) contain much unstructured data that could be used as proxies for potential confounding factors. The goal of this study was to assess whether th...

Bibliographic Details
Main Authors: Zubair Afzal, Gwen M C Masclee, Miriam C J M Sturkenboom, Jan A Kors, Martijn J Schuemie
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2019-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0212999
_version_ 1818916419031531520
author Zubair Afzal
Gwen M C Masclee
Miriam C J M Sturkenboom
Jan A Kors
Martijn J Schuemie
author_sort Zubair Afzal
collection DOAJ
description BACKGROUND: Propensity score (PS) methods are commonly used to control for confounding in comparative effectiveness studies. Electronic health records (EHRs) contain much unstructured data that could serve as proxies for potential confounding factors. The goal of this study was to assess whether this unstructured information can also be used to construct PS models that properly deal with confounding. As an example, we used coxibs (COX-2 inhibitors) vs. traditional NSAIDs and the risk of upper gastrointestinal bleeding, since this association is often confounded by channeling of coxibs to patients at higher risk of upper gastrointestinal bleeding. METHODS: In a cohort study of new users of nonsteroidal anti-inflammatory drugs (NSAIDs) from the Dutch Integrated Primary Care Information (IPCI) database, we identified all patients who experienced upper gastrointestinal bleeding (UGIB). We used large-scale regularized regression to fit two PS models using all structured and unstructured information in the EHR. We calculated hazard ratios (HRs) to estimate the risk of UGIB among selective cyclo-oxygenase-2 (COX-2) inhibitor users compared to nonselective NSAID (nsNSAID) users. RESULTS: The crude HR of UGIB for COX-2 inhibitors compared to nsNSAIDs was 0.50 (95% confidence interval 0.18-1.36). Matching only on age resulted in an HR of 0.36 (0.11-1.16), and of 0.35 (0.11-1.11) when further adjusted for sex. Matching on PS only, the first model yielded an HR of 0.42 (0.13-1.38), which decreased to 0.35 (0.96-1.25) when adjusted for age and sex. The second model resulted in an HR of 0.42 (0.13-1.39), which dropped to 0.31 (0.09-1.08) after adjustment for age and sex. CONCLUSIONS: PS models can be created using unstructured information in EHRs. Matching on PS showed an incremental benefit over traditional matching and adjustment for covariates.
first_indexed 2024-12-20T00:17:52Z
format Article
id doaj.art-575425eb79864c4ca32abaff4df001be
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-20T00:17:52Z
publishDate 2019-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-575425eb79864c4ca32abaff4df001be
2022-12-21T20:00:16Z
eng
Public Library of Science (PLoS)
PLoS ONE
1932-6203
2019-01-01
14
3
e0212999
10.1371/journal.pone.0212999
Generating and evaluating a propensity model using textual features from electronic medical records.
Zubair Afzal
Gwen M C Masclee
Miriam C J M Sturkenboom
Jan A Kors
Martijn J Schuemie
https://doi.org/10.1371/journal.pone.0212999
title Generating and evaluating a propensity model using textual features from electronic medical records.
url https://doi.org/10.1371/journal.pone.0212999
work_keys_str_mv AT zubairafzal generatingandevaluatingapropensitymodelusingtextualfeaturesfromelectronicmedicalrecords
AT gwenmcmasclee generatingandevaluatingapropensitymodelusingtextualfeaturesfromelectronicmedicalrecords
AT miriamcjmsturkenboom generatingandevaluatingapropensitymodelusingtextualfeaturesfromelectronicmedicalrecords
AT janakors generatingandevaluatingapropensitymodelusingtextualfeaturesfromelectronicmedicalrecords
AT martijnjschuemie generatingandevaluatingapropensitymodelusingtextualfeaturesfromelectronicmedicalrecords
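
The METHODS part of the description field mentions fitting PS models with large-scale regularized regression on unstructured EHR information and then matching on the PS. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes binary bag-of-words features, an L1 penalty fitted by proximal gradient descent, and greedy 1:1 caliper matching. The toy notes, the treatment labels, and every function and parameter name (`bag_of_words`, `fit_l1_logistic`, `greedy_match`, `lam`, `caliper`) are hypothetical and chosen for illustration.

```python
import math


def bag_of_words(notes):
    """Turn each free-text note into a sorted list of word-feature indices."""
    vocab = sorted({w for note in notes for w in note.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    rows = [sorted({index[w] for w in note.lower().split()}) for note in notes]
    return rows, vocab


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(rows, treated, n_features, lam=0.01, lr=0.5, epochs=300):
    """L1-regularized logistic regression via proximal gradient descent."""
    w = [0.0] * n_features
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, t in zip(rows, treated):
            err = sigmoid(b + sum(w[j] for j in row)) - t
            grad_b += err / n
            for j in row:
                grad_w[j] += err / n
        b -= lr * grad_b  # the intercept is not penalized
        for j in range(n_features):
            step = w[j] - lr * grad_w[j]
            # Soft-thresholding: shrink towards zero, clip small weights to 0.
            w[j] = math.copysign(max(abs(step) - lr * lam, 0.0), step)
    return w, b


def propensity_scores(rows, w, b):
    """Predicted probability of receiving the treatment for each patient."""
    return [sigmoid(b + sum(w[j] for j in row)) for row in rows]


def greedy_match(ps, treated, caliper=0.1):
    """Greedy 1:1 nearest-neighbour PS matching without replacement."""
    controls = {i for i, t in enumerate(treated) if t == 0}
    pairs = []
    for i in (i for i, t in enumerate(treated) if t == 1):
        if not controls:
            break
        best = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[best] - ps[i]) <= caliper:
            pairs.append((i, best))
            controls.remove(best)
    return pairs


# Hypothetical toy notes; 1 = COX-2 inhibitor user, 0 = nsNSAID user.
notes = [
    "ulcer dyspepsia", "ulcer anticoagulant", "dyspepsia prior bleed",
    "knee pain", "back pain dyspepsia",
    "knee pain", "back pain sprain", "sprain ankle",
    "back pain dyspepsia", "ulcer dyspepsia",
]
treated = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

rows, vocab = bag_of_words(notes)
w, b = fit_l1_logistic(rows, treated, len(vocab))
ps = propensity_scores(rows, w, b)
pairs = greedy_match(ps, treated)
print(pairs)  # (treated index, control index) pairs within the caliper
```

In a real analysis, the hazard ratios would then be estimated on the matched cohort with a survival model; that step is omitted here because it needs follow-up and outcome data that a catalog record does not provide.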