Linking Data for Mothers and Babies in De-Identified Electronic Health Data.

Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England. Retrospective linkage study using electronic hospital...


Bibliographic Details
Main Authors: Katie Harron, Ruth Gilbert, David Cromwell, Jan van der Meulen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2016-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5072610?pdf=render
collection DOAJ
description Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England. Retrospective linkage study using electronic hospital records of mothers and babies admitted to NHS hospitals in England, captured in Hospital Episode Statistics between April 2001 and March 2013. Of 672,955 baby records in 2012/13, 280,470 (42%) linked deterministically to a maternal record using hospital, GP practice, maternal age, birthweight, gestation, birth order and sex. A further 380,164 (56%) records linked using probabilistic methods incorporating additional variables that could differ between mother/baby records (admission dates, ethnicity, 3/4-character postcode district) or that include missing values (delivery variables). The false-match rate was estimated at 0.15% using synthetic data. Data quality improved over time: for 2001/02, 91% of baby records were linked (holding the estimated false-match rate at 0.15%). The linked cohort was representative of national distributions of gender, gestation, birth weight and maternal age, and captured approximately 97% of births in England. Probabilistic linkage of maternal and baby healthcare characteristics offers an efficient way to enrich maternity data, improve data quality, and create longitudinal cohorts for research and service evaluation. This approach could be extended to linkage of other datasets that have non-disclosive characteristics in common.
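The two-stage approach summarised in the description (exact agreement on a fixed set of variables first, then probabilistic scoring for the remainder) can be illustrated with a minimal sketch. This is not the authors' implementation: the record only names the linkage variables, so the field names, the m/u probabilities, the threshold, and the choice of Fellegi-Sunter-style log-likelihood-ratio weights are all assumptions made for illustration.

```python
from math import log2

# Hypothetical field names for the deterministic step; the abstract lists
# the variables (hospital, GP practice, maternal age, birthweight,
# gestation, birth order, sex) but not the actual schema.
DETERMINISTIC_KEYS = (
    "hospital", "gp_practice", "maternal_age", "birthweight",
    "gestation", "birth_order", "sex",
)

# Assumed (m, u) probabilities per field, illustrative only:
#   m = P(fields agree | records are a true mother-baby pair)
#   u = P(fields agree | records are not a pair)
M_U = {
    "hospital": (0.98, 0.01),
    "gp_practice": (0.90, 0.001),
    "maternal_age": (0.98, 0.04),
    "birthweight": (0.95, 0.001),
    "gestation": (0.95, 0.10),
    "birth_order": (0.99, 0.90),
    "sex": (0.99, 0.50),
    "postcode_district": (0.90, 0.001),
    "ethnicity": (0.85, 0.20),
}


def deterministic_match(baby, mother):
    """True only if every deterministic key is present and identical."""
    return all(
        baby.get(k) is not None and baby.get(k) == mother.get(k)
        for k in DETERMINISTIC_KEYS
    )


def match_weight(baby, mother):
    """Sum of per-field log2 likelihood ratios (Fellegi-Sunter style)."""
    total = 0.0
    for field, (m, u) in M_U.items():
        b, mo = baby.get(field), mother.get(field)
        if b is None or mo is None:
            continue  # a missing value contributes no evidence either way
        if b == mo:
            total += log2(m / u)              # agreement weight
        else:
            total += log2((1 - m) / (1 - u))  # disagreement weight
    return total


def link(baby, mothers, threshold=10.0):
    """Return (mother, method): deterministic first, then best-scoring
    probabilistic candidate if it reaches the (assumed) threshold."""
    for mother in mothers:
        if deterministic_match(baby, mother):
            return mother, "deterministic"
    best = max(mothers, key=lambda mo: match_weight(baby, mo), default=None)
    if best is not None and match_weight(baby, best) >= threshold:
        return best, "probabilistic"
    return None, "unlinked"
```

In this sketch a baby record with a missing GP practice fails the deterministic step but can still link probabilistically if the remaining fields agree strongly enough. In practice the threshold would be tuned against an error estimate, such as the false-match rate the description reports estimating from synthetic data.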
first_indexed 2024-04-13T13:19:14Z
id doaj.art-000dacc8fb90465cb772c08378dbb00f
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-04-13T13:19:14Z
spelling doaj.art-000dacc8fb90465cb772c08378dbb00f 2022-12-22T02:45:22Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2016-01-01 11 10 e0164667 10.1371/journal.pone.0164667 http://europepmc.org/articles/PMC5072610?pdf=render
title Linking Data for Mothers and Babies in De-Identified Electronic Health Data.