Reconciling Parent-Child Relationships across US Administrative Datasets

Introduction: Population data capture children, parents, relatives, and others moving in and out of households. The U.S. has seen falling marriage rates and increases in multigenerational households, complex families, young children living with grandparents, and adult children living with parents...


Bibliographic Details
Main Authors: Amy O'Hara, Katie Genadek, Carla Medalia, Trent Alexander
Format: Article
Language: English
Published: Swansea University 2018-09-01
Series: International Journal of Population Data Science
Online Access: https://ijpds.org/article/view/864
collection DOAJ
description Introduction: Population data capture children, parents, relatives, and others moving in and out of households. The U.S. has seen falling marriage rates and increases in multigenerational households, complex families, young children living with grandparents, and adult children living with parents. Robust parent-child linkages are critical to understanding these demographic shifts.
Objectives and Approach: We construct and validate parent-child linkages over a century to observe how U.S. households are changing over time. The three largest person-based datafiles in the U.S. are the decennial censuses, the Social Security Administration transaction file, and individual tax returns from the Internal Revenue Service. These sources operationalize relationships differently, capture data at various frequencies, and gather data for different purposes. We use probabilistic matching to observe and reconcile parent-child relationships across these sources. The data include a variety of personal identifiers, including name, date of birth, parents’ names, address, and place of birth, that support matching and validation.
Results: We find that understanding the content, consistency, and coverage of the files before matching is critical for high-quality linkages. The representativeness of the parent-child relationship file improves over time, with the weakest coverage for the Greatest Generation and the strongest coverage for Millennials. Coverage varies by source: tax data underrepresent non-white children and have duplicate records for SSNs, while names and dates of birth are missing from Census data. Multiple match rates differ among demographic groups and over time. In the matching process, blocking relies on variables common to all of the population datasets. Our approach provides robust entity resolution for women, despite married-maiden name changes. We describe challenges due to data problems in old census records and validation changes in Social Security data.
Conclusion/Implications: We successfully reconcile parent-child relationships in U.S. population-level files. The project supports operational and research uses, such as the 2020 Census. We will extend this work using graph matching and will expand the method to validate other relationship links, including spouses and siblings.
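The abstract names probabilistic matching with blocking but does not give the model, the fields compared, or any parameters. The following is only a minimal sketch of one common approach (Fellegi-Sunter-style agreement weights within blocks), not the authors' method. Every field name, the m/u probabilities, the similarity cutoff, and the link threshold are hypothetical values chosen for illustration.

```python
import math
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical (m, u) probabilities per field: m = P(agree | same person),
# u = P(agree | different people). Real values would be estimated from data.
M_U = {
    "first_name": (0.95, 0.05),
    "last_name": (0.85, 0.02),
    "birth_date": (0.97, 0.01),
    "mother_first_name": (0.90, 0.05),
}
FUZZY_FIELDS = {"first_name", "last_name", "mother_first_name"}


def agrees(field, a, b, cutoff=0.85):
    """Return True/False for agreement, or None if either value is missing."""
    if not a or not b:
        return None
    if field in FUZZY_FIELDS:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= cutoff
    return a == b  # exact comparison for dates


def match_weight(rec_a, rec_b):
    """Sum log2 agreement/disagreement weights; missing fields add nothing."""
    total = 0.0
    for field, (m, u) in M_U.items():
        result = agrees(field, rec_a.get(field), rec_b.get(field))
        if result is None:
            continue
        total += math.log2(m / u) if result else math.log2((1 - m) / (1 - u))
    return total


def block_key(rec):
    """Block on birth year and place of birth; None if either is missing."""
    date, place = rec.get("birth_date"), rec.get("birth_place")
    return (date[:4], place) if date and place else None


def link(file_a, file_b, threshold=8.0):
    """Compare only records sharing a block key; keep pairs above threshold."""
    blocks = defaultdict(list)
    for rec in file_b:
        key = block_key(rec)
        if key is not None:
            blocks[key].append(rec)
    links = []
    for rec_a in file_a:
        for rec_b in blocks.get(block_key(rec_a), []):
            weight = match_weight(rec_a, rec_b)
            if weight >= threshold:
                links.append((rec_a["id"], rec_b["id"], weight))
    return links
```

In this sketch a surname disagreement lowers the score but does not veto a link, so a record whose last name changed can still match when first name, date of birth, and mother's first name agree. That is one simple way to tolerate married-maiden name changes under these assumptions; records that lack a blocking variable are skipped here and would need a separate pass.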
first_indexed 2024-03-09T07:20:56Z
format Article
id doaj.art-6ee547cb0592446a81b710359d7bea85
institution Directory Open Access Journal
issn 2399-4908
language English
last_indexed 2024-03-09T07:20:56Z
publishDate 2018-09-01
publisher Swansea University
record_format Article
series International Journal of Population Data Science
spelling doaj.art-6ee547cb0592446a81b710359d7bea85
2023-12-03T07:30:54Z
eng
Swansea University
International Journal of Population Data Science
2399-4908
2018-09-01
Volume 3, Issue 4
10.23889/ijpds.v3i4.864
864
Reconciling Parent-Child Relationships across US Administrative Datasets
Amy O'Hara (Stanford University)
Katie Genadek (US Census Bureau)
Carla Medalia (US Census Bureau)
Trent Alexander (ICPSR, University of Michigan)
https://ijpds.org/article/view/864