Household Matching for the 2021 Census

Matching households between the UK census and the census coverage survey is essential to estimating census overcount and undercount. Census quality requirements are extremely high. In 2021, we aim to produce outputs within a year of census day (previously, within 16 months...

Bibliographic Details
Main Authors: Josie Plachta, Charlie Tomlin
Format: Article
Language: English
Published: Swansea University 2019-11-01
Series: International Journal of Population Data Science
Online Access: https://ijpds.org/article/view/1246
collection DOAJ
description Matching households between the UK census and the census coverage survey (CCS) is essential to estimating census overcount and undercount. Census quality requirements are extremely high. In 2021, we aim to produce outputs within a year of census day (previously, within 16 months). Clerical searching is very time-consuming, so to meet these challenging timelines we need to increase the automatic match rate. Matching is done at the household level as well as the person level. In 2011, automatic household linkage primarily used a derived ‘head of household’ (HOH) alongside other household variables, but difficulty assigning the same HOH to a household on both the census and the CCS reduced the effectiveness of the method. To address this, we designed a method that combines variables from the household itself with sets of individual person data and runs deterministic match-keys to match households. Associative matching was then applied to find some of the remaining households: the households of already-matched people were used to produce candidate household matches for clerical review, a process substantially quicker than clerical searching. Using the 2011 census as our test data, where 264,882 matches were found (60% automatically), our new methods matched 94% of households through deterministic match-keys with a precision of 99.99%. An additional 3% were linked through association and 5% were sent for clerical matching, leaving only 1,600 matches (about 0.6% of the 264,882) to be found through resource-heavy clerical searching. By using match-keys made up of sets of person data in addition to household variables, together with associative matching from person data, we increased the number of matches made automatically on test data from Census 2011. This will reduce the resources needed for clerical matching and searching in 2021, enabling us to meet shorter timelines while maintaining high quality.
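The two automatic stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout, the field names (`postcode`, `people`), the two example match-keys and the rule that a key must be unique on both sides are all assumptions made for illustration. The first function runs deterministic match-keys that combine a household variable with a set of person data; the second uses person-level matches (assumed to come from a separate person-matching step) to propose candidate household pairs for clerical review.

```python
from collections import Counter, defaultdict


def person_set(household):
    """Order-independent set of normalised (forename, surname, dob) tuples."""
    return frozenset(
        (f.strip().lower(), s.strip().lower(), dob)
        for f, s, dob in household["people"]
    )


# Hypothetical match-keys, run from strictest to loosest (not the ONS keys).
MATCH_KEYS = [
    # Key 1: postcode plus the full set of people.
    lambda h: (h["postcode"], person_set(h)),
    # Key 2: postcode plus the set of (surname, dob), tolerating forename errors.
    lambda h: (h["postcode"], frozenset((s, d) for _, s, d in person_set(h))),
]


def deterministic_match(census, ccs, match_keys=MATCH_KEYS):
    """Link households whose key value is unique on both sides."""
    matches = {}  # census household id -> CCS household id
    for key in match_keys:
        used_ccs = set(matches.values())
        census_index, ccs_index = defaultdict(list), defaultdict(list)
        for h in census:
            if h["id"] not in matches and h["people"]:
                census_index[key(h)].append(h["id"])
        for h in ccs:
            if h["id"] not in used_ccs and h["people"]:
                ccs_index[key(h)].append(h["id"])
        for k, c_ids in census_index.items():
            s_ids = ccs_index.get(k, [])
            if len(c_ids) == 1 and len(s_ids) == 1:
                matches[c_ids[0]] = s_ids[0]
    return matches


def associative_candidates(person_matches, census_hh_of, ccs_hh_of, matches):
    """Count matched people supporting each still-unmatched household pair."""
    matched_ccs = set(matches.values())
    votes = Counter()
    for c_pid, s_pid in person_matches:
        c_hh, s_hh = census_hh_of[c_pid], ccs_hh_of[s_pid]
        if c_hh not in matches and s_hh not in matched_ccs:
            votes[(c_hh, s_hh)] += 1
    return votes.most_common()  # candidates for clerical review


# Made-up example records.
census = [
    {"id": "C1", "postcode": "AB1 2CD",
     "people": [("Ann", "Lee", "1980-01-01"), ("Bob", "Lee", "1982-02-02")]},
    {"id": "C2", "postcode": "AB1 2CD", "people": [("Cara", "Moss", "1990-03-03")]},
    {"id": "C3", "postcode": "EF3 4GH",
     "people": [("Dev", "Shah", "1975-05-05"), ("Eve", "Shah", "1977-06-06")]},
]
ccs = [
    {"id": "S1", "postcode": "AB1 2CD",
     "people": [("Bob", "Lee", "1982-02-02"), ("Ann", "Lee", "1980-01-01")]},
    {"id": "S2", "postcode": "AB1 2CD", "people": [("Kara", "Moss", "1990-03-03")]},
    {"id": "S3", "postcode": "EF3 4GJ",  # postcode differs, so no key fires
     "people": [("Dev", "Shah", "1975-05-05"), ("Eve", "Shah", "1977-06-06")]},
]

matches = deterministic_match(census, ccs)
print(matches)  # {'C1': 'S1', 'C2': 'S2'}

# Person-level matches and person-to-household lookups (assumed inputs).
person_matches = [("C1-p1", "S1-p1"), ("C3-p1", "S3-p1"), ("C3-p2", "S3-p2")]
census_hh_of = {"C1-p1": "C1", "C3-p1": "C3", "C3-p2": "C3"}
ccs_hh_of = {"S1-p1": "S1", "S3-p1": "S3", "S3-p2": "S3"}

candidates = associative_candidates(person_matches, census_hh_of, ccs_hh_of, matches)
print(candidates)  # [(('C3', 'S3'), 2)]
```

In this toy data the first key links C1 to S1 despite the different person order, the looser second key links C2 to S2 despite the forename discrepancy, and the household pair C3/S3, which no key can link because the postcodes differ, is surfaced as a clerical-review candidate supported by two matched people.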
id doaj.art-071552a8f7fd4f50a2dc639301611930
institution Directory Open Access Journal
issn 2399-4908
spelling Josie Plachta and Charlie Tomlin (Office for National Statistics). Household Matching for the 2021 Census. International Journal of Population Data Science, vol. 4, no. 3, Swansea University, 2019-11-01. ISSN 2399-4908. doi:10.23889/ijpds.v4i3.1246. https://ijpds.org/article/view/1246