Accurate ethnicity prediction from placental DNA methylation data

Bibliographic Details
Main Authors: Victor Yuan, E. Magda Price, Giulia Del Gobbo, Sara Mostafavi, Brian Cox, Alexandra M. Binder, Karin B. Michels, Carmen Marsit, Wendy P. Robinson
Format: Article
Language: English
Published: BMC 2019-08-01
Series: Epigenetics & Chromatin
Subjects: Epigenetics, DNA methylation, Microarray, Placenta, Population stratification, Machine learning
Online Access: http://link.springer.com/article/10.1186/s13072-019-0296-3
author Victor Yuan
E. Magda Price
Giulia Del Gobbo
Sara Mostafavi
Brian Cox
Alexandra M. Binder
Karin B. Michels
Carmen Marsit
Wendy P. Robinson
collection DOAJ
description Abstract. Background: The influence of genetics on variation in DNA methylation (DNAme) is well documented, yet confounding from population stratification is often unaccounted for in DNAme association studies. Existing approaches that use DNAme data to address confounding by population stratification may not generalize to populations or tissues outside those in which they were developed. To aid future placental DNAme studies in assessing population stratification, we developed an ethnicity classifier, PlaNET (Placental DNAme Elastic Net Ethnicity Tool), using Infinium Human Methylation 450k BeadChip array (HM450k) data from placental samples in five cohorts; the classifier is also compatible with the newer EPIC platform. Results: Data from 509 placental samples were used to develop PlaNET and to show that it accurately predicts (accuracy = 0.938, kappa = 0.823) major classes of self-reported ethnicity/race (African: n = 58, Asian: n = 53, Caucasian: n = 389) and produces ethnicity probabilities that are highly correlated with genetic ancestry inferred from genome-wide SNP arrays (> 2.5 million SNPs) and ancestry informative markers (n = 50 SNPs). PlaNET’s ethnicity classification relies on 1860 HM450k microarray sites, over half of which (n = 955) were linked to nearby genetic polymorphisms. Our placental-optimized method outperforms existing approaches in assessing population stratification in placental samples from individuals of Asian, African, and Caucasian ethnicities. Conclusion: PlaNET provides an improved approach to address population stratification in placental DNAme association studies. The method can be applied to predict ethnicity as a discrete or continuous variable and will be especially useful when self-reported ethnicity information is missing and genotyping markers are unavailable.
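The abstract reports classifier performance as accuracy and Cohen's kappa. As a rough illustration of how these two metrics relate, the sketch below computes both from a three-class confusion matrix. The function name and the counts are hypothetical and chosen only for the example; they are not the study's data and this is not the PlaNET implementation.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples with true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Chance agreement: expected diagonal mass if predictions were
    # independent of the true labels, given the same marginal totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts (rows: true African, Asian, Caucasian; columns: predicted).
example = [
    [8, 0, 2],
    [0, 7, 3],
    [1, 1, 78],
]
acc, kappa = accuracy_and_kappa(example)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which matters when one class dominates the sample, as the Caucasian class does in the counts reported in the abstract.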
first_indexed 2024-12-11T13:06:52Z
format Article
id doaj.art-02c4ae74a7e34e4682b3456b469c8329
institution Directory of Open Access Journals
issn 1756-8935
language English
last_indexed 2024-12-11T13:06:52Z
publishDate 2019-08-01
publisher BMC
record_format Article
series Epigenetics & Chromatin
spelling doaj.art-02c4ae74a7e34e4682b3456b469c8329; 2022-12-22T01:06:17Z; eng; BMC; Epigenetics & Chromatin; ISSN 1756-8935; 2019-08-01; vol. 12, no. 1, pp. 1-14; doi 10.1186/s13072-019-0296-3
Accurate ethnicity prediction from placental DNA methylation data
Victor Yuan (Department of Medical Genetics, University of British Columbia)
E. Magda Price (Department of Medical Genetics, University of British Columbia)
Giulia Del Gobbo (Department of Medical Genetics, University of British Columbia)
Sara Mostafavi (Department of Medical Genetics, University of British Columbia)
Brian Cox (Department of Physiology, University of Toronto)
Alexandra M. Binder (Department of Epidemiology, Fielding School of Public Health, University of California)
Karin B. Michels (Department of Epidemiology, Fielding School of Public Health, University of California)
Carmen Marsit (Department of Environmental Health, Emory University)
Wendy P. Robinson (Department of Medical Genetics, University of British Columbia)
title Accurate ethnicity prediction from placental DNA methylation data
topic Epigenetics
DNA methylation
Microarray
Placenta
Population stratification
Machine learning
url http://link.springer.com/article/10.1186/s13072-019-0296-3