COVID-19 transmission and infection: linkage of COVID-19 Infection Survey, Test and Trace, and Patient Demographics Survey.

Bibliographic Details
Main Authors: Leah Maizey, Josie Plachta, Holly Clarke, Sarah Collyer, Sarah Cummins, Gabriela De La Serna, Shelley Gammon, Ben Harries, Elizabeth Pereira, Caroline Youell
Format: Article
Language: English
Published: Swansea University 2022-08-01
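Series: International Journal of Population Data Science
Subjects: COVID-19; Covid Infection Survey; Test and Trace; Personal Demographics Service; Deterministic linkage; Probabilistic linkage
Online Access: https://ijpds.org/article/view/2094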
_version_ 1797421693755457536
author Leah Maizey
Josie Plachta
Holly Clarke
Sarah Collyer
Sarah Cummins
Gabriela De La Serna
Shelley Gammon
Ben Harries
Elizabeth Pereira
Caroline Youell
collection DOAJ
description Objectives: Data linkage was conducted between the Office for National Statistics’ Covid Infection Survey (CIS), the Department of Health and Social Care’s Test and Trace (T&T) and the NHS’s Personal Demographics Service (PDS) datasets. Linked data was required to provide reliable estimates of COVID-19 transmission and infection rates, which were used to inform policy during the ongoing pandemic.
Approach: The CIS was created to track infection rates in the UK population. Linking CIS participants to positive tests in T&T helped improve these estimates. Linkage to PDS was required to attach NHS numbers to these datasets, facilitating further linkages that could also be used to inform Government about the spread of the virus. Multiple approaches were used to link the data. Initially, T&T was linked to itself via a series of strict matchkeys to cluster records belonging to the same individual and so create a person-level identifier. Subsequent linkage of CIS-PDS, T&T-PDS and CIS-T&T involved deterministic linkages with matchkeys designed and applied independently. A probabilistic method (Fellegi-Sunter scoring) was also used to link CIS-PDS and CIS-T&T. Additional associative links were created between CIS and T&T records that had matched to the same PDS record but had not matched to each other.
Results: The accuracy of the CIS-PDS and CIS-T&T linkages was high (recall and precision >98%; lower bounds of all 95% confidence intervals >93%). A quality assessment of T&T-PDS is underway, as are relevant bias analyses.
Conclusion: As a result of this linkage, COVID-19 analysts have access to enriched datasets in which previously separate variables can be compared, with confidence that the linkage method met the required quality standards. The linked data has been used to provide crucial evidence to Government on infection and re-infection rates. Subsequent linkages have enabled analysts to explore risk factors associated with different variants of the virus, vaccination status and hospital episodes. Improvements continue to be made.
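The three linkage ideas named in the abstract (deterministic matchkeys, Fellegi-Sunter scoring, and associative links through a shared PDS record) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the field names, the m- and u-probabilities, and all function names below are hypothetical, chosen only to show the general shape of each technique.

```python
import math
from collections import defaultdict

# Hypothetical (m, u) probabilities per field; illustrative values only.
# m = P(field agrees | true match), u = P(field agrees | non-match).
FIELD_PARAMS = {
    "forename": (0.95, 0.01),
    "surname": (0.95, 0.005),
    "dob": (0.98, 0.001),
    "postcode": (0.90, 0.0005),
}


def normalise(value):
    """Trim and upper-case a value so trivial differences do not block a match."""
    return str(value).strip().upper()


def matchkey(record, fields):
    """Build a matchkey from the given fields, or None if any field is missing."""
    values = [record.get(f) for f in fields]
    if any(v is None or str(v).strip() == "" for v in values):
        return None
    return tuple(normalise(v) for v in values)


def deterministic_link(left, right, fields):
    """Link records whose matchkeys agree exactly and point to one candidate."""
    index = defaultdict(list)
    for rec in right:
        key = matchkey(rec, fields)
        if key is not None:
            index[key].append(rec["id"])
    links = []
    for rec in left:
        key = matchkey(rec, fields)
        candidates = index.get(key, []) if key is not None else []
        if len(candidates) == 1:  # skip ambiguous keys
            links.append((rec["id"], candidates[0]))
    return links


def fellegi_sunter_score(a, b):
    """Sum log2 agreement/disagreement weights over the compared fields."""
    score = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        va, vb = a.get(field), b.get(field)
        if va is None or vb is None:
            continue  # assumption: a missing value contributes no evidence
        if normalise(va) == normalise(vb):
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score


def associative_links(cis_to_pds, tt_to_pds, existing):
    """Pair CIS and T&T records that share a PDS record but are not yet linked."""
    tt_by_pds = defaultdict(list)
    for tt_id, pds_id in tt_to_pds:
        tt_by_pds[pds_id].append(tt_id)
    return [
        (cis_id, tt_id)
        for cis_id, pds_id in cis_to_pds
        for tt_id in tt_by_pds.get(pds_id, [])
        if (cis_id, tt_id) not in existing
    ]
```

In a sketch like this, pairs scoring above an upper threshold would be accepted, pairs below a lower threshold rejected, and those in between sent for clerical review; the thresholds themselves would need to be set from the data and are not given in the abstract.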
first_indexed 2024-03-09T07:21:08Z
format Article
id doaj.art-a5513f038d794e70be993bdecae2339d
institution Directory Open Access Journal
issn 2399-4908
language English
last_indexed 2024-03-09T07:21:08Z
publishDate 2022-08-01
publisher Swansea University
record_format Article
series International Journal of Population Data Science
spelling doaj.art-a5513f038d794e70be993bdecae2339d
2023-12-03T07:29:45Z
eng
Swansea University
International Journal of Population Data Science
2399-4908
2022-08-01
Volume 7, Issue 3
DOI 10.23889/ijpds.v7i3.2094
COVID-19 transmission and infection: linkage of COVID-19 Infection Survey, Test and Trace, and Patient Demographics Survey.
Leah Maizey; Josie Plachta; Holly Clarke; Sarah Collyer; Sarah Cummins; Gabriela De La Serna; Shelley Gammon; Ben Harries; Elizabeth Pereira; Caroline Youell (all: The Office for National Statistics)
https://ijpds.org/article/view/2094
COVID-19; Covid Infection Survey; Test and Trace; Personal Demographics Service; Deterministic linkage; Probabilistic linkage
title COVID-19 transmission and infection: linkage of COVID-19 Infection Survey, Test and Trace, and Patient Demographics Survey.
topic COVID-19
Covid Infection Survey
Test and Trace
Personal Demographics Service
Deterministic linkage
Probabilistic linkage
url https://ijpds.org/article/view/2094