Accurate ethnicity prediction from placental DNA methylation data
Abstract. Background: The influence of genetics on variation in DNA methylation (DNAme) is well documented. Yet confounding from population stratification is often unaccounted for in DNAme association studies. Existing approaches to address confounding by population stratification using DNAme data may...
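Main Authors: | Victor Yuan, E. Magda Price, Giulia Del Gobbo, Sara Mostafavi, Brian Cox, Alexandra M. Binder, Karin B. Michels, Carmen Marsit, Wendy P. Robinson |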
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2019-08-01 |
Series: | Epigenetics & Chromatin |
Subjects: | Epigenetics, DNA methylation, Microarray, Placenta, Population stratification, Machine learning |
Online Access: | http://link.springer.com/article/10.1186/s13072-019-0296-3 |
_version_ | 1818149427453362176 |
---|---|
author | Victor Yuan; E. Magda Price; Giulia Del Gobbo; Sara Mostafavi; Brian Cox; Alexandra M. Binder; Karin B. Michels; Carmen Marsit; Wendy P. Robinson |
author_sort | Victor Yuan |
collection | DOAJ |
description | Abstract. Background: The influence of genetics on variation in DNA methylation (DNAme) is well documented, yet confounding from population stratification is often unaccounted for in DNAme association studies. Existing approaches to address confounding by population stratification using DNAme data may not generalize to populations or tissues outside those in which they were developed. To aid future placental DNAme studies in assessing population stratification, we developed an ethnicity classifier, PlaNET (Placental DNAme Elastic Net Ethnicity Tool), using placental Infinium Human Methylation 450k BeadChip array (HM450k) data from five cohorts; the classifier is also compatible with the newer EPIC platform. Results: Data from 509 placental samples were used to develop PlaNET and to show that it accurately predicts (accuracy = 0.938, kappa = 0.823) major classes of self-reported ethnicity/race (African: n = 58, Asian: n = 53, Caucasian: n = 389) and produces ethnicity probabilities that are highly correlated with genetic ancestry inferred from genome-wide SNP arrays (> 2.5 million SNPs) and ancestry informative markers (n = 50 SNPs). PlaNET’s ethnicity classification relies on 1860 HM450k microarray sites, over half of which (n = 955) were linked to nearby genetic polymorphisms. Our placental-optimized method outperforms existing approaches in assessing population stratification in placental samples from individuals of Asian, African, and Caucasian ethnicities. Conclusion: PlaNET provides an improved approach to address population stratification in placental DNAme association studies. The method can be applied to predict ethnicity as a discrete or continuous variable and will be especially useful when self-reported ethnicity information is missing and genotyping markers are unavailable. |
first_indexed | 2024-12-11T13:06:52Z |
format | Article |
id | doaj.art-02c4ae74a7e34e4682b3456b469c8329 |
institution | Directory Open Access Journal |
issn | 1756-8935 |
language | English |
last_indexed | 2024-12-11T13:06:52Z |
publishDate | 2019-08-01 |
publisher | BMC |
record_format | Article |
series | Epigenetics & Chromatin |
spelling | doaj.art-02c4ae74a7e34e4682b3456b469c8329; 2022-12-22T01:06:17Z; eng; BMC; Epigenetics & Chromatin; 1756-8935; 2019-08-01; 12; 1; 1; 14; 10.1186/s13072-019-0296-3; Accurate ethnicity prediction from placental DNA methylation data; Victor Yuan (Department of Medical Genetics, University of British Columbia); E. Magda Price (Department of Medical Genetics, University of British Columbia); Giulia Del Gobbo (Department of Medical Genetics, University of British Columbia); Sara Mostafavi (Department of Medical Genetics, University of British Columbia); Brian Cox (Department of Physiology, University of Toronto); Alexandra M. Binder (Department of Epidemiology, Fielding School of Public Health, University of California); Karin B. Michels (Department of Epidemiology, Fielding School of Public Health, University of California); Carmen Marsit (Department of Environmental Health, Emory University); Wendy P. Robinson (Department of Medical Genetics, University of British Columbia); http://link.springer.com/article/10.1186/s13072-019-0296-3; Epigenetics; DNA methylation; Microarray; Placenta; Population stratification; Machine learning |
title | Accurate ethnicity prediction from placental DNA methylation data |
title_sort | accurate ethnicity prediction from placental dna methylation data |
topic | Epigenetics; DNA methylation; Microarray; Placenta; Population stratification; Machine learning |
url | http://link.springer.com/article/10.1186/s13072-019-0296-3 |