Linking Synthetic Populations to Household Geolocations: A Demonstration in Namibia

Whether evaluating gridded population dataset estimates (e.g., WorldPop, LandScan) or household survey sample designs, a population census linked to residential locations is needed. Geolocated census microdata, however, are almost never available and are thus best simulated. In this paper, we...


Bibliographic Details
Main Authors: Dana R. Thomson, Lieke Kools, Warren C. Jochem
Format: Article
Language: English
Published: MDPI AG 2018-08-01
Series: Data
Subjects: simulation, census, simPop, LMIC
Online Access: http://www.mdpi.com/2306-5729/3/3/30
author Dana R. Thomson
Lieke Kools
Warren C. Jochem
collection DOAJ
description Whether evaluating gridded population dataset estimates (e.g., WorldPop, LandScan) or household survey sample designs, a population census linked to residential locations is needed. Geolocated census microdata, however, are almost never available and are thus best simulated. In this paper, we simulate a close-to-reality population of individuals nested in households geolocated to realistic building locations. Using the R simPop package and ArcGIS, multiple realizations of a geolocated synthetic population are derived from the Namibia 2011 census 20% microdata sample, Namibia census enumeration area boundaries, the Namibia 2013 Demographic and Health Survey (DHS), and dozens of spatial covariates derived from publicly available datasets. Realistic household latitude-longitude coordinates are manually generated based on public satellite imagery. Simulated households are linked to latitude-longitude coordinates by identifying distinct household types with multivariate k-means analysis and modelling a probability surface for each household type using Random Forest machine learning methods. We simulate five realizations of a synthetic population in Namibia’s Oshikoto region, including demographic, socioeconomic, and outcome characteristics at the level of household, woman, and child. Variables in the synthetic population were compared with the 2011 census 20% sample and 2013 DHS data by primary sampling unit/enumeration area. We found that synthetic population variable distributions matched the observed data and followed expected spatial patterns. We outline a novel process to simulate a close-to-reality microdata census geolocated to realistic building locations in a low- or middle-income country setting to support spatial demographic research and survey methodological development while avoiding disclosure risk of individuals.
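The linking step in the description can be illustrated with a minimal sketch. It assumes the household types (from k-means) and the per-type probability surface (from Random Forest) already exist, and it shows one plausible way to place households: each household draws an unused candidate point with probability proportional to the surface value for its type. The record does not state how the authors perform this draw, so the function name `link_households`, the uniform fallback, and all coordinates and weights below are hypothetical and for illustration only.

```python
import random


def link_households(households, points, surface, seed=0):
    """Assign each simulated household to a distinct candidate point.

    households: list of (household_id, household_type)
    points:     list of (point_id, lat, lon) candidate building locations
    surface:    dict of household_type -> {point_id: probability weight}
                (stand-in for a modelled probability surface)
    """
    rng = random.Random(seed)
    free = {pid: (lat, lon) for pid, lat, lon in points}
    links = {}
    for hh_id, hh_type in households:
        if not free:
            raise ValueError("more households than candidate points")
        candidates = list(free)
        weights = [surface[hh_type].get(pid, 0.0) for pid in candidates]
        if sum(weights) <= 0:
            # Assumption: if no remaining point is plausible, pick uniformly.
            weights = [1.0] * len(candidates)
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
        # Remove the point so no two households share a location.
        links[hh_id] = (chosen, *free.pop(chosen))
    return links


# Made-up candidate points and surface values (not from the paper).
points = [
    ("p1", -18.50, 16.90),
    ("p2", -18.51, 16.92),
    ("p3", -18.60, 17.10),
    ("p4", -18.62, 17.11),
]
surface = {
    "type_a": {"p1": 0.6, "p2": 0.3, "p3": 0.1, "p4": 0.0},
    "type_b": {"p1": 0.0, "p2": 0.1, "p3": 0.4, "p4": 0.5},
}
households = [("h1", "type_a"), ("h2", "type_b"), ("h3", "type_a")]

links = link_households(households, points, surface, seed=1)
for hh_id, (pid, lat, lon) in links.items():
    print(hh_id, pid, lat, lon)
```

Rerunning with different `seed` values would give different realizations of the same synthetic population, similar in spirit to the multiple realizations the description mentions.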
first_indexed 2024-04-11T13:37:57Z
format Article
id doaj.art-c1c65360bbe04ea8ad22e699d0e3a0c8
institution Directory Open Access Journal
issn 2306-5729
language English
last_indexed 2024-04-11T13:37:57Z
publishDate 2018-08-01
publisher MDPI AG
record_format Article
series Data
spelling doaj.art-c1c65360bbe04ea8ad22e699d0e3a0c8 2022-12-22T04:21:23Z eng MDPI AG, Data, ISSN 2306-5729, 2018-08-01, 3(3), 30, doi 10.3390/data3030030 (data3030030)
Linking Synthetic Populations to Household Geolocations: A Demonstration in Namibia
Dana R. Thomson (Flowminder Foundation, SE-11355 Stockholm, Sweden)
Lieke Kools (Department of Economics, Leiden University, 2311 EZ Leiden, The Netherlands)
Warren C. Jochem (Flowminder Foundation, SE-11355 Stockholm, Sweden)
http://www.mdpi.com/2306-5729/3/3/30
simulation, census, simPop, LMIC
title Linking Synthetic Populations to Household Geolocations: A Demonstration in Namibia
topic simulation
census
simPop
LMIC
url http://www.mdpi.com/2306-5729/3/3/30