Mendelian randomization with fine‐mapped genetic data: choosing from large numbers of correlated instrumental variables

Bibliographic Details
Main Authors: Burgess, S; Zuber, V; Valdez-Marquez, E; Sun, B; Hopewell, J
Format: Journal article
Published: Wiley 2017
collection OXFORD
description Mendelian randomization uses genetic variants to make causal inferences about the effect of a risk factor on an outcome. With fine‐mapped genetic data, there may be hundreds of genetic variants in a single gene region, any of which could be used to assess this causal relationship. However, using too many genetic variants in the analysis can lead to spurious estimates and inflated Type 1 error rates. If only a few genetic variants are used, on the other hand, the majority of the data is ignored and estimates are highly sensitive to the particular choice of variants. We propose an approach based on summarized data only (genetic association and correlation estimates) that uses principal components analysis to form instruments. This approach has desirable theoretical properties: it takes the totality of data into account and does not suffer from numerical instabilities. It also has good properties in simulation studies: it is not particularly sensitive to varying the genetic variants included in the analysis or the genetic correlation matrix, and it does not have greatly inflated Type 1 error rates. Overall, the method gives estimates that are less precise than those from variable selection approaches (such as using a conditional analysis or pruning approach to select variants), but that are more robust to seemingly arbitrary choices in the variable selection step. The methods are illustrated by an example that uses genetic associations with testosterone for 320 genetic variants to assess the effect of sex hormone‐related pathways on coronary artery disease risk, in which variable selection approaches give inconsistent inferences.
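The idea of forming instruments from principal components of summarized data can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name pca_ivw, the particular weighting of the matrix that is decomposed, the 99% variance threshold, and the use of an inverse-variance weighted estimator on the transformed associations are all assumptions that the abstract does not specify.

```python
import numpy as np

def pca_ivw(beta_x, beta_y, se_y, rho, var_explained=0.99):
    """Hypothetical sketch: PCA-based instruments from summarized data.

    beta_x : variant associations with the risk factor
    beta_y : variant associations with the outcome
    se_y   : standard errors of beta_y
    rho    : correlation matrix between the variants
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    rho = np.asarray(rho, dtype=float)

    # Assumed weighting: scale the correlation matrix by the precision-weighted
    # risk factor associations, so strong variants dominate the components.
    w = beta_x / se_y
    psi = np.outer(w, w) * rho

    # Eigen-decomposition of the symmetric weighted matrix, largest first.
    eigvals, eigvecs = np.linalg.eigh(psi)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Keep the leading components that explain the requested variance share.
    frac = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(frac, var_explained)) + 1, len(eigvals))
    loadings = eigvecs[:, :k]

    # Transform the associations and the outcome covariance matrix.
    bx = loadings.T @ beta_x
    by = loadings.T @ beta_y
    omega = loadings.T @ (np.outer(se_y, se_y) * rho) @ loadings

    # Inverse-variance weighted estimate on the transformed instruments.
    omega_inv_bx = np.linalg.solve(omega, bx)
    denom = bx @ omega_inv_bx
    estimate = (by @ omega_inv_bx) / denom
    std_error = np.sqrt(1.0 / denom)
    return estimate, std_error, k
```

Because the components are orthogonal combinations of all variants, no variant is discarded outright, and the transformed covariance matrix is much smaller and better conditioned than the full matrix for hundreds of highly correlated variants.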
first_indexed 2024-03-07T03:13:41Z
id oxford-uuid:b51229ac-2c74-4b23-ab4b-9a0409f7a047
institution University of Oxford
last_indexed 2024-03-07T03:13:41Z