COVID-19 transmission and infection: linkage of COVID-19 Infection Survey, Test and Trace, and Patient Demographics Survey.
Objectives Data linkage was conducted between the Office for National Statistics’ Covid Infection Survey (CIS), the Department of Health and Social Care’s Test and Trace (T&T) and the NHS’s Personal Demographics Service (PDS) datasets. Linked data was required to provide reliable estimates of rates o...
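Main Authors: | Leah Maizey, Josie Plachta, Holly Clarke, Sarah Collyer, Sarah Cummins, Gabriela De La Serna, Shelley Gammon, Ben Harries, Elizabeth Pereira, Caroline Youell |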
---|---|
Format: | Article |
Language: | English |
Published: | Swansea University, 2022-08-01 |
Series: | International Journal of Population Data Science |
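Subjects: | COVID-19, Covid Infection Survey, Test and Trace, Personal Demographics Service, Deterministic linkage, Probabilistic linkage |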
Online Access: | https://ijpds.org/article/view/2094 |
_version_ | 1797421693755457536 |
---|---|
author | Leah Maizey, Josie Plachta, Holly Clarke, Sarah Collyer, Sarah Cummins, Gabriela De La Serna, Shelley Gammon, Ben Harries, Elizabeth Pereira, Caroline Youell |
author_facet | Leah Maizey, Josie Plachta, Holly Clarke, Sarah Collyer, Sarah Cummins, Gabriela De La Serna, Shelley Gammon, Ben Harries, Elizabeth Pereira, Caroline Youell |
author_sort | Leah Maizey |
collection | DOAJ |
description | Objectives
Data linkage was conducted between the Office for National Statistics’ Covid Infection Survey (CIS), the Department of Health and Social Care’s Test and Trace (T&T) and the NHS’s Personal Demographics Service (PDS) datasets. Linked data was required to provide reliable estimates of the rates of COVID-19 transmission and infection that were used to inform policy on the ongoing pandemic.
Approach
The CIS was created to track infection rates in the UK population. Linking CIS participants to positive tests in T&T helped improve these estimates. Linkage to PDS was required to attach NHS numbers to these datasets, facilitating further linkages that could also be used to inform Government about the spread of the virus. Multiple approaches were used to link the data. Initially, T&T was linked to itself via a series of strict matchkeys to cluster records belonging to the same individual and so create a person-level identifier. Subsequent linkage of CIS-PDS, T&T-PDS and CIS-T&T involved deterministic linkages with matchkeys designed and applied independently. A probabilistic (Fellegi-Sunter scoring) method was also used to link CIS-PDS and CIS-T&T. Additional associative links were created between CIS and T&T records that had matched to the same PDS record but had not matched to each other.
Results
The accuracy of the CIS-PDS and CIS-T&T linkages was high (recall and precision >98%; lower bounds of all 95% confidence intervals >93%). A quality assessment of T&T-PDS is underway, as are relevant bias analyses.
Conclusion
As a result of this linkage, COVID-19 analysts have access to enriched, linked datasets in which previously separate variables can be compared, with confidence that the linkage method met the required quality standards. The linked data has been used to provide crucial evidence to Government on infection and re-infection rates. Subsequent linkages have enabled analysts to explore risk factors associated with different variants of the virus, vaccination status and hospital episodes. Improvements continue to be made.
|
first_indexed | 2024-03-09T07:21:08Z |
format | Article |
id | doaj.art-a5513f038d794e70be993bdecae2339d |
institution | Directory Open Access Journal |
issn | 2399-4908 |
language | English |
last_indexed | 2024-03-09T07:21:08Z |
publishDate | 2022-08-01 |
publisher | Swansea University |
record_format | Article |
series | International Journal of Population Data Science |
spelling | doaj.art-a5513f038d794e70be993bdecae2339d; 2023-12-03T07:29:45Z; eng; Swansea University; International Journal of Population Data Science; ISSN 2399-4908; 2022-08-01; vol. 7, no. 3; DOI 10.23889/ijpds.v7i3.2094; all ten authors affiliated with The Office for National Statistics; https://ijpds.org/article/view/2094 |
title | COVID-19 transmission and infection: linkage of COVID-19 Infection Survey, Test and Trace, and Patient Demographics Survey. |
topic | COVID-19 Covid Infection Survey Test and Trace Personal Demographics Service Deterministic linkage Probabilistic linkage |
url | https://ijpds.org/article/view/2094 |
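The linkage steps summarised under Approach in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names, the two matchkeys, the m/u probabilities, the score threshold and the sample records are all hypothetical and chosen only for illustration. The sketch shows a deterministic pass over matchkeys applied strictest first, a Fellegi-Sunter agreement score for records left unlinked, and associative links derived from two record sets that matched to the same PDS record.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not from the study):
# m = P(field agrees | same person), u = P(field agrees | different people).
M_U = {
    "forename": (0.95, 0.01),
    "surname": (0.95, 0.005),
    "dob": (0.98, 0.001),
    "postcode": (0.90, 0.0005),
}

# Hypothetical matchkeys, applied strictest first.
MATCHKEYS = [
    ("forename", "surname", "dob", "postcode"),
    ("surname", "dob", "postcode"),
]


def deterministic_link(left, right, matchkeys=MATCHKEYS):
    """Link left to right records that agree uniquely on a matchkey."""
    links = {}
    for key in matchkeys:
        index = {}
        for rec in right:
            values = tuple(rec.get(f) for f in key)
            if None not in values:  # incomplete keys are never indexed
                index.setdefault(values, []).append(rec["id"])
        for rec in left:
            values = tuple(rec.get(f) for f in key)
            if rec["id"] in links or None in values:
                continue
            candidates = index.get(values, [])
            if len(candidates) == 1:  # accept unambiguous matches only
                links[rec["id"]] = candidates[0]
    return links


def fs_score(a, b, m_u=M_U):
    """Fellegi-Sunter match weight: sum of log2 agreement/disagreement ratios."""
    score = 0.0
    for field, (m, u) in m_u.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score


def probabilistic_link(left, right, already, threshold=10.0):
    """Link residual left records to their best-scoring right record."""
    links = {}
    for a in left:
        if a["id"] in already or not right:
            continue
        best = max(right, key=lambda b: fs_score(a, b))
        if fs_score(a, best) >= threshold:  # threshold is a hypothetical cut-off
            links[a["id"]] = best["id"]
    return links


def associative_links(cis_to_pds, tt_to_pds, direct_pairs):
    """CIS-T&T pairs that share a PDS record but were not linked directly."""
    pds_to_tt = {}
    for tt_id, pds_id in tt_to_pds.items():
        pds_to_tt.setdefault(pds_id, []).append(tt_id)
    extra = []
    for cis_id, pds_id in cis_to_pds.items():
        for tt_id in pds_to_tt.get(pds_id, []):
            if (cis_id, tt_id) not in direct_pairs:
                extra.append((cis_id, tt_id))
    return extra


# Invented sample records standing in for CIS and PDS extracts.
cis = [
    {"id": "c1", "forename": "ann", "surname": "lee", "dob": "1980-01-02", "postcode": "AB1 2CD"},
    {"id": "c2", "forename": "bob", "surname": "ray", "dob": "1975-05-05", "postcode": None},
]
pds = [
    {"id": "p1", "forename": "ann", "surname": "lee", "dob": "1980-01-02", "postcode": "AB1 2CD"},
    {"id": "p2", "forename": "robert", "surname": "ray", "dob": "1975-05-05", "postcode": "ZZ9 9ZZ"},
]

exact = deterministic_link(cis, pds)
fuzzy = probabilistic_link(cis, pds, already=exact)
```

In this toy data the first record links deterministically, while the second, which has a missing postcode and a differing forename, is left for the scoring step. A production linkage would normally add blocking, partial-agreement comparators and clerical review around the threshold, none of which is shown here.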