Effects of environment, genetics and data analysis pitfalls in an esophageal cancer genome-wide association study.

The development of new high-throughput genotyping technologies has allowed fast evaluation of single nucleotide polymorphisms (SNPs) on a genome-wide scale. Several recent genome-wide association studies employing these technologies suggest that panels of SNPs can be a useful tool for predicting can...

Bibliographic Details
Main Authors: Alexander Statnikov, Chun Li, Constantin F Aliferis
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2007-09-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC1978529?pdf=render
collection DOAJ
description The development of new high-throughput genotyping technologies has allowed fast evaluation of single nucleotide polymorphisms (SNPs) on a genome-wide scale. Several recent genome-wide association studies employing these technologies suggest that panels of SNPs can be a useful tool for predicting cancer susceptibility and discovery of potentially important new disease loci. In the present paper we undertake a careful examination of the relative significance of genetics, environmental factors, and biases of the data analysis protocol that was used in a previously published genome-wide association study. That prior study reported a nearly perfect discrimination of esophageal cancer patients and healthy controls on the basis of only genetic information. On the other hand, our results strongly suggest that SNPs in this dataset are not statistically linked to the phenotype, while several environmental factors and especially family history of esophageal cancer (a proxy to both environmental and genetic factors) have only a modest association with the disease. The main component of the previously claimed strong discriminatory signal is due to several data analysis pitfalls that in combination led to the strongly optimistic results. Such pitfalls are preventable and should be avoided in future studies since they create misleading conclusions and generate many false leads for subsequent research.
first_indexed 2024-12-13T13:16:21Z
id doaj.art-5f24aef157b24bc0b14c73e132f291b7
institution Directory Open Access Journal
issn 1932-6203
spelling PLoS ONE, vol. 2, no. 9, e958 (2007-09-01); ISSN 1932-6203; DOI 10.1371/journal.pone.0000958; record timestamp 2022-12-21T23:44:32Z