Machine learning using longitudinal prescription and medical claims for the detection of non-alcoholic steatohepatitis (NASH)

Objectives To develop and evaluate machine learning models to detect patients with suspected undiagnosed non-alcoholic steatohepatitis (NASH) for diagnostic screening and clinical management. Methods In this retrospective observational non-interventional study using administrative medical claims data...

Bibliographic Details
Main Authors: Orla Doyle, Ozge Yasar, Patrick Long, Brett Harder, Hanna Marshall, Sanjay Bhasin, Suyin Lee, Mark Delegge, Stephanie Roy, Nadea Leavitt, John Rigg
Format: Article
Language: English
Published: BMJ Publishing Group 2022-02-01
Series: BMJ Health & Care Informatics
Online Access: https://informatics.bmj.com/content/29/1/e100510.full
Collection: DOAJ
Description: Objectives To develop and evaluate machine learning models to detect patients with suspected undiagnosed non-alcoholic steatohepatitis (NASH) for diagnostic screening and clinical management. Methods In this retrospective observational non-interventional study using administrative medical claims data from 1 463 089 patients, gradient-boosted decision trees were trained to detect patients with likely NASH from an at-risk patient population with a history of obesity, type 2 diabetes mellitus, metabolic disorder or non-alcoholic fatty liver (NAFL). Models were trained to detect likely NASH in all at-risk patients or in the subset without a prior NAFL diagnosis (at-risk non-NAFL patients). Models were trained and validated using retrospective medical claims data and assessed using area under precision-recall curves and receiver operating characteristic curves (AUPRCs and AUROCs). Results The 6-month incidences of NASH in claims data were 1 per 1437 at-risk patients and 1 per 2127 at-risk non-NAFL patients. The model trained to detect NASH in all at-risk patients had an AUPRC of 0.0107 (95% CI 0.0104 to 0.0110) and an AUROC of 0.84. At 10% recall, model precision was 4.3%, which is 60× above NASH incidence. The model trained to detect NASH in the non-NAFL cohort had an AUPRC of 0.0030 (95% CI 0.0029 to 0.0031) and an AUROC of 0.78. At 10% recall, model precision was 1%, which is 20× above NASH incidence. Conclusion The low incidence of NASH in medical claims data corroborates the pattern of NASH underdiagnosis in clinical practice. Claims-based machine learning could facilitate the detection of patients with probable NASH for diagnostic testing and disease management.
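The "60×" and "20×" figures in the description can be checked with simple arithmetic: dividing the reported precision at 10% recall by the reported 6-month incidence gives the lift over screening at random. The sketch below is only an illustration of that calculation using the numbers quoted in the abstract; the function name `lift_over_incidence` is hypothetical and not taken from the article.

```python
def lift_over_incidence(precision: float, patients_per_case: int) -> float:
    """Ratio of model precision to baseline incidence.

    Incidence is given as "1 case per N patients", so it equals 1 / N.
    (Hypothetical helper for illustration; not the authors' code.)
    """
    incidence = 1.0 / patients_per_case
    return precision / incidence


# Figures quoted in the abstract: precision at 10% recall and 6-month incidence.
all_at_risk = lift_over_incidence(precision=0.043, patients_per_case=1437)
non_nafl = lift_over_incidence(precision=0.01, patients_per_case=2127)

print(f"All at-risk patients: {all_at_risk:.1f}x incidence")
print(f"At-risk non-NAFL patients: {non_nafl:.1f}x incidence")
```

The results, roughly 62× and 21×, are consistent with the rounded "60×" and "20×" stated in the abstract.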
ISSN: 2632-1009
DOI: 10.1136/bmjhci-2021-100510
Volume/Issue: 29(1)
Author affiliations:
Orla Doyle, Ozge Yasar, John Rigg: Real World Solutions, IQVIA, London, UK
Patrick Long, Brett Harder, Hanna Marshall, Sanjay Bhasin, Suyin Lee, Stephanie Roy, Nadea Leavitt: Real World Solutions, IQVIA, Plymouth Meeting, Pennsylvania, USA
Mark Delegge: Therapeutic Center of Excellence, IQVIA, Durham, North Carolina, USA