Social Data Linkage Environment

Bibliographic Details
Main Author: Richard Trudeau
Format: Article
Language: English
Published: Swansea University 2017-04-01
Series: International Journal of Population Data Science
Online Access: https://ijpds.org/article/view/76
author Richard Trudeau
collection DOAJ
description ABSTRACT

Objectives: The Social Data Linkage Environment (SDLE) at Statistics Canada promotes the innovative use of existing administrative and survey data to address important research questions and inform socio-economic policy through record linkage. It expands the potential of data integration across multiple domains, such as health, justice, education and income, through the creation of linked analytical data files without the need to collect additional data from Canadians.

Approach: At the core of the SDLE is a Derived Record Depository (DRD), essentially a national dynamic relational data base containing only basic personal identifiers. The DRD is created by linking selected Statistics Canada source index files for the purpose of producing a list of unique individuals. These files are brought into the environment, processed and linked only once to the DRD. Each individual in the DRD is assigned an SDLE identifier. Some of the source index files used to build the DRD include tax records, vital statistics registration records (births and deaths), and immigrant data. Updates to these data files are linked to the DRD on an ongoing basis. Only basic personal identifiers are stored in the DRD. Examples of personal identifiers stored in the DRD include surnames, given names, date of birth, sex, insurance numbers, parents' names, marital status, addresses (including postal codes), telephone numbers, immigration date, emigration date and date of death. The paired SDLE identifiers and source index file record IDs resulting from the record linkage are stored in a Key Registry.

To reduce the risk of privacy intrusiveness and to minimize the risk of disclosure, source files are separated into source index files and source data files. Employees performing the record linkages in SDLE have access to only the basic personal identifiers needed for linkage. Employees who build the analytical files for research have access only to the data stripped of personal identifiers.

Results: The SDLE is a highly secure environment that facilitates the creation of linked population data files for social analysis. It is not a large integrated data base.

Conclusion: The SDLE program facilitates pan-Canadian social and economic statistical research. It is a record linkage environment that: increases the relevance of existing surveys without collecting new data; substantially increases the use of administrative data; generates new information without additional data collection; maintains the highest privacy and data security standards; and promotes a standardized approach to record linkage processes and methods.
first_indexed 2024-03-09T09:24:33Z
format Article
id doaj.art-f291f1d5288e4678a901eb0d15c1f72c
institution Directory Open Access Journal
issn 2399-4908
language English
last_indexed 2024-03-09T09:24:33Z
publishDate 2017-04-01
publisher Swansea University
record_format Article
series International Journal of Population Data Science
spelling doaj.art-f291f1d5288e4678a901eb0d15c1f72c. Richard Trudeau (Statistics Canada). Social Data Linkage Environment. International Journal of Population Data Science, vol. 1, no. 1, 2017-04-01. Swansea University. ISSN 2399-4908. doi:10.23889/ijpds.v1i1.76. https://ijpds.org/article/view/76
title Social Data Linkage Environment
url https://ijpds.org/article/view/76
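
The Approach described in the abstract (source files split into index and data files, a depository of identifiers, and a Key Registry pairing SDLE identifiers with source record IDs) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the article: the function and class names, the `record_id` field, the sample records, and in particular the exact-match rule on surname, given name and date of birth, which merely stands in for whatever linkage method Statistics Canada actually uses.

```python
# Illustrative only: names, fields and the matching rule are assumptions.
ID_FIELDS = ("surname", "given_name", "date_of_birth")  # assumed identifiers


def split_source_file(records, id_fields=ID_FIELDS):
    """Separate a source file into an index file (identifiers only)
    and a data file (analytical variables only), both keyed by record_id."""
    index_file, data_file = [], []
    for rec in records:
        index_file.append(
            {"record_id": rec["record_id"], **{f: rec[f] for f in id_fields}}
        )
        data_file.append({k: v for k, v in rec.items() if k not in id_fields})
    return index_file, data_file


class DerivedRecordDepository:
    """Toy depository: stores identifiers only and assigns SDLE identifiers."""

    def __init__(self):
        self._by_key = {}        # normalized identifiers -> SDLE id
        self.key_registry = []   # (sdle_id, source_name, record_id)

    def link(self, source_name, index_file, id_fields=ID_FIELDS):
        for row in index_file:
            # Assumed rule: exact match after trimming and lower-casing.
            key = tuple(str(row[f]).strip().lower() for f in id_fields)
            sdle_id = self._by_key.setdefault(key, len(self._by_key) + 1)
            self.key_registry.append((sdle_id, source_name, row["record_id"]))


def build_analytical_file(key_registry, data_files):
    """Join data files through the Key Registry; no identifiers are read."""
    lookup = {(src, rid): sid for sid, src, rid in key_registry}
    linked = {}
    for src, rows in data_files.items():
        for row in rows:
            sid = lookup[(src, row["record_id"])]
            out = linked.setdefault(sid, {"sdle_id": sid})
            out.update(
                {f"{src}.{k}": v for k, v in row.items() if k != "record_id"}
            )
    return list(linked.values())


# Made-up sample records for two hypothetical source files.
tax = [{"record_id": "T1", "surname": "Roy", "given_name": "Ann",
        "date_of_birth": "1980-02-03", "income": 52000}]
health = [{"record_id": "H9", "surname": "ROY", "given_name": "ann",
           "date_of_birth": "1980-02-03", "visits": 4}]

tax_index, tax_data = split_source_file(tax)
health_index, health_data = split_source_file(health)

drd = DerivedRecordDepository()
drd.link("tax", tax_index)        # linkage step sees identifiers only
drd.link("health", health_index)

analytical = build_analytical_file(
    drd.key_registry, {"tax": tax_data, "health": health_data}
)
print(analytical)
```

In this sketch the linkage step never touches `income` or `visits`, and the step that builds the analytical file never touches names or dates of birth, which mirrors the separation of access the abstract describes.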