Powerful eQTL mapping through low-coverage RNA sequencing

Summary: Mapping genetic variants that regulate gene expression (eQTL mapping) in large-scale RNA sequencing (RNA-seq) studies is often employed to understand functional consequences of regulatory variants. However, the high cost of RNA-seq limits sample size, sequencing depth, and, therefore, disco...

Bibliographic Details
Main Authors: Tommer Schwarz, Toni Boltz, Kangcheng Hou, Merel Bot, Chenda Duan, Loes Olde Loohuis, Marco P. Boks, René S. Kahn, Roel A. Ophoff, Bogdan Pasaniuc
Format: Article
Language: English
Published: Elsevier 2022-07-01
Series: HGG Advances
Subjects: RNA-seq; eQTL mapping; association testing; low coverage
Online Access: http://www.sciencedirect.com/science/article/pii/S2666247722000197
_version_ 1818015115317870592
author Tommer Schwarz
Toni Boltz
Kangcheng Hou
Merel Bot
Chenda Duan
Loes Olde Loohuis
Marco P. Boks
René S. Kahn
Roel A. Ophoff
Bogdan Pasaniuc
author_facet Tommer Schwarz
Toni Boltz
Kangcheng Hou
Merel Bot
Chenda Duan
Loes Olde Loohuis
Marco P. Boks
René S. Kahn
Roel A. Ophoff
Bogdan Pasaniuc
author_sort Tommer Schwarz
collection DOAJ
description Summary: Mapping genetic variants that regulate gene expression (eQTL mapping) in large-scale RNA sequencing (RNA-seq) studies is often employed to understand functional consequences of regulatory variants. However, the high cost of RNA-seq limits sample size, sequencing depth, and, therefore, discovery power in eQTL studies. In this work, we demonstrate that, given a fixed budget, eQTL discovery power can be increased by lowering the sequencing depth per sample and increasing the number of individuals sequenced in the assay. We perform RNA-seq of whole-blood tissue across 1,490 individuals at low coverage (5.9 million reads/sample) and show that the effective power is higher than that of an RNA-seq study of 570 individuals at moderate coverage (13.9 million reads/sample). Next, we leverage synthetic datasets derived from real RNA-seq data (50 million reads/sample) to explore the interplay of coverage and number of individuals in eQTL studies, and show that a 10-fold reduction in coverage leads to only a 2.5-fold reduction in statistical power to identify eQTLs. Our work suggests that lowering coverage while increasing the number of individuals in RNA-seq is an effective approach to increase discovery power in eQTL studies.
first_indexed 2024-04-14T06:52:45Z
format Article
id doaj.art-dd3d82f998fd44a3a02e42c16647ba33
institution Directory of Open Access Journals
issn 2666-2477
language English
last_indexed 2024-04-14T06:52:45Z
publishDate 2022-07-01
publisher Elsevier
record_format Article
series HGG Advances
spelling doaj.art-dd3d82f998fd44a3a02e42c16647ba33
2022-12-22T02:06:58Z
eng
Elsevier
HGG Advances
2666-2477
2022-07-01
Volume 3, issue 3, article 100103
Powerful eQTL mapping through low-coverage RNA sequencing
Tommer Schwarz (0), Toni Boltz (1), Kangcheng Hou (2), Merel Bot (3), Chenda Duan (4), Loes Olde Loohuis (5), Marco P. Boks (6), René S. Kahn (7), Roel A. Ophoff (8), Bogdan Pasaniuc (9)
(0) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Corresponding author
(1) Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
(2) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
(3) Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, CA, USA
(4) Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
(5) Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
(6) Department of Psychiatry, Brain Center University Medical Center Utrecht, University Utrecht, Utrecht, the Netherlands
(7) Department of Psychiatry, Brain Center University Medical Center Utrecht, University Utrecht, Utrecht, the Netherlands; Department of Psychiatry, Icahn School of Medicine, Mount Sinai, NY, USA
(8) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Psychiatry, Erasmus University Medical Center, Rotterdam, the Netherlands
(9) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Corresponding author
Summary: Mapping genetic variants that regulate gene expression (eQTL mapping) in large-scale RNA sequencing (RNA-seq) studies is often employed to understand functional consequences of regulatory variants. However, the high cost of RNA-seq limits sample size, sequencing depth, and, therefore, discovery power in eQTL studies. In this work, we demonstrate that, given a fixed budget, eQTL discovery power can be increased by lowering the sequencing depth per sample and increasing the number of individuals sequenced in the assay. We perform RNA-seq of whole-blood tissue across 1,490 individuals at low coverage (5.9 million reads/sample) and show that the effective power is higher than that of an RNA-seq study of 570 individuals at moderate coverage (13.9 million reads/sample). Next, we leverage synthetic datasets derived from real RNA-seq data (50 million reads/sample) to explore the interplay of coverage and number of individuals in eQTL studies, and show that a 10-fold reduction in coverage leads to only a 2.5-fold reduction in statistical power to identify eQTLs. Our work suggests that lowering coverage while increasing the number of individuals in RNA-seq is an effective approach to increase discovery power in eQTL studies.
http://www.sciencedirect.com/science/article/pii/S2666247722000197
RNA-seq
eQTL mapping
association testing
low coverage
spellingShingle Tommer Schwarz
Toni Boltz
Kangcheng Hou
Merel Bot
Chenda Duan
Loes Olde Loohuis
Marco P. Boks
René S. Kahn
Roel A. Ophoff
Bogdan Pasaniuc
Powerful eQTL mapping through low-coverage RNA sequencing
HGG Advances
RNA-seq
eQTL mapping
association testing
low coverage
title Powerful eQTL mapping through low-coverage RNA sequencing
title_full Powerful eQTL mapping through low-coverage RNA sequencing
title_fullStr Powerful eQTL mapping through low-coverage RNA sequencing
title_full_unstemmed Powerful eQTL mapping through low-coverage RNA sequencing
title_short Powerful eQTL mapping through low-coverage RNA sequencing
title_sort powerful eqtl mapping through low coverage rna sequencing
topic RNA-seq
eQTL mapping
association testing
low coverage
url http://www.sciencedirect.com/science/article/pii/S2666247722000197
work_keys_str_mv AT tommerschwarz powerfuleqtlmappingthroughlowcoveragernasequencing
AT toniboltz powerfuleqtlmappingthroughlowcoveragernasequencing
AT kangchenghou powerfuleqtlmappingthroughlowcoveragernasequencing
AT merelbot powerfuleqtlmappingthroughlowcoveragernasequencing
AT chendaduan powerfuleqtlmappingthroughlowcoveragernasequencing
AT loesoldeloohuis powerfuleqtlmappingthroughlowcoveragernasequencing
AT marcopboks powerfuleqtlmappingthroughlowcoveragernasequencing
AT reneskahn powerfuleqtlmappingthroughlowcoveragernasequencing
AT roelaophoff powerfuleqtlmappingthroughlowcoveragernasequencing
AT bogdanpasaniuc powerfuleqtlmappingthroughlowcoveragernasequencing