Precision screening for familial hypercholesterolaemia: a machine learning study applied to electronic health encounter data

Summary: Background: Cardiovascular outcomes for people with familial hypercholesterolaemia can be improved with diagnosis and medical management. However, 90% of individuals with familial hypercholesterolaemia remain undiagnosed in the USA. We aimed to accelerate early diagnosis and timely interve...


Bibliographic Details
Main Authors: Kelly D Myers, BS, Joshua W Knowles, MD, David Staszak, PhD, Michael D Shapiro, DO, William Howard, PhD, Mrinal Yadava, MD, David Zuzick, MBA, Latoya Williamson, MS, Nigam H Shah, PhD, Juan M Banda, PhD, Joe Leader, BS, William C Cromwell, MD, Ed Trautman, PhD, Michael F Murray, ProfMD, Seth J Baum, MD, Seth Myers, PhD, Samuel S Gidding, MD, Katherine Wilemon, BS, Daniel J Rader, ProfMD
Format: Article
Language: English
Published: Elsevier 2019-12-01
Series: The Lancet: Digital Health
Online Access: http://www.sciencedirect.com/science/article/pii/S2589750019301505
collection DOAJ
description Summary:
Background: Cardiovascular outcomes for people with familial hypercholesterolaemia can be improved with diagnosis and medical management. However, 90% of individuals with familial hypercholesterolaemia remain undiagnosed in the USA. We aimed to accelerate early diagnosis and timely intervention for more than 1·3 million undiagnosed individuals with familial hypercholesterolaemia at high risk for early heart attacks and strokes by applying machine learning to large health-care encounter datasets.
Methods: We trained the FIND FH machine learning model using deidentified health-care encounter data, including procedure and diagnostic codes, prescriptions, and laboratory findings, from 939 clinically diagnosed individuals with familial hypercholesterolaemia (395 of whom had a molecular diagnosis) and 83 136 individuals presumed free of familial hypercholesterolaemia, sampled from four US institutions. The model was then applied to a national health-care encounter database (170 million individuals) and an integrated health-care delivery system dataset (174 000 individuals). Individuals used in model training and those evaluated by the model were required to have at least one cardiovascular disease risk factor (eg, hypertension, hypercholesterolaemia, or hyperlipidemia). A Health Insurance Portability and Accountability Act of 1996-compliant programme was developed to allow providers to receive identification of individuals likely to have familial hypercholesterolaemia in their practice.
Findings: Using a model with a measured precision (positive predictive value) of 0·85, recall (sensitivity) of 0·45, area under the precision–recall curve of 0·55, and area under the receiver operating characteristic curve of 0·89, we flagged 1 331 759 of 170 416 201 patients in the national database and 866 of 173 733 individuals in the health-care delivery system dataset as likely to have familial hypercholesterolaemia. Familial hypercholesterolaemia experts reviewed a sample of flagged individuals (45 from the national database and 103 from the health-care delivery system dataset) and applied clinical familial hypercholesterolaemia diagnostic criteria. Of those reviewed, 87% (95% CI 73–100) in the national database and 77% (68–86) in the health-care delivery system dataset were categorised as having a high enough clinical suspicion of familial hypercholesterolaemia to warrant guideline-based clinical evaluation and treatment.
Interpretation: The FIND FH model successfully scans large, diverse, and disparate health-care encounter databases to identify individuals with familial hypercholesterolaemia.
Funding: The FH Foundation funded this study. Support was received from Amgen, Sanofi, and Regeneron.
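The figures quoted in the Findings can be put on a common scale with a little arithmetic. The sketch below is illustrative only and is not part of the study: it takes the precision, recall, and flagged counts stated in the abstract and derives the share of each population that was flagged, plus an F1 score, which the abstract does not report. The function and variable names are hypothetical.

```python
# Values copied from the abstract's Findings paragraph.
PRECISION = 0.85
RECALL = 0.45
FLAGGED = {
    "national database": (1_331_759, 170_416_201),
    "health-care delivery system": (866, 173_733),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (derived here, not reported in the abstract)."""
    return 2 * precision * recall / (precision + recall)


def flag_rate(flagged: int, screened: int) -> float:
    """Fraction of screened individuals that the model flagged."""
    return flagged / screened


for name, (flagged, screened) in FLAGGED.items():
    print(f"{name}: {100 * flag_rate(flagged, screened):.2f}% flagged")
print(f"F1 at the stated operating point: {f1_score(PRECISION, RECALL):.2f}")
```

On these numbers, well under 1% of each population is flagged, which is consistent with an operating point that favours precision over recall.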
first_indexed 2024-12-16T07:25:08Z
format Article
id doaj.art-afcf56b13f3644449e6e76f03276a057
institution Directory Open Access Journal
issn 2589-7500
language English
last_indexed 2024-12-16T07:25:08Z
publishDate 2019-12-01
publisher Elsevier
record_format Article
series The Lancet: Digital Health
spelling doaj.art-afcf56b13f3644449e6e76f03276a057 2022-12-21T22:39:31Z eng Elsevier, The Lancet: Digital Health, ISSN 2589-7500, 2019-12-01, volume 1, issue 8, pages e393-e402
Author affiliations:
0 Kelly D Myers, BS: The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA; Atomo, Austin, TX, USA
1 Joshua W Knowles, MD: The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA; Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, USA
2 David Staszak, PhD: Atomo, Austin, TX, USA
3 Michael D Shapiro, DO: Department of Medicine, Center for Preventive Cardiology, Knight Cardiovascular Institute, Oregon Health & Science University, Portland, OR, USA
4 William Howard, PhD: Atomo, Austin, TX, USA
5 Mrinal Yadava, MD: Department of Medicine, Center for Preventive Cardiology, Knight Cardiovascular Institute, Oregon Health & Science University, Portland, OR, USA
6 David Zuzick, MBA: The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA
7 Latoya Williamson, MS: The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA
8 Nigam H Shah, PhD: Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
9 Juan M Banda, PhD: Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
10 Joe Leader, BS: Geisinger Health System, Danville, PA, USA
11 William C Cromwell, MD: Lipoprotein & Metabolic Disorders Institute, Raleigh, NC, USA
12 Ed Trautman, PhD: Laboratory Corporation of America Holdings, Burlington, NC, USA
13 Michael F Murray, ProfMD: Yale Center for Genomic Health, New Haven, CT, USA
14 Seth J Baum, MD: Department of Integrated Medical Sciences, Charles E Schmidt College of Medicine, Florida Atlantic University, Boca Raton, FL, USA
15 Seth Myers, PhD: Atomo, Austin, TX, USA
16 Samuel S Gidding, MD: The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA
17 Katherine Wilemon, BS: The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA
18 Daniel J Rader, ProfMD: The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA; Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA; Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA; Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
Correspondence to: Mr Kelly D Myers, The Familial Hypercholesterolemia Foundation, Pasadena, CA 91106, USA
url http://www.sciencedirect.com/science/article/pii/S2589750019301505