Powerful eQTL mapping through low-coverage RNA sequencing
Summary: Mapping genetic variants that regulate gene expression (eQTL mapping) in large-scale RNA sequencing (RNA-seq) studies is often employed to understand functional consequences of regulatory variants. However, the high cost of RNA-seq limits sample size, sequencing depth, and, therefore, disco...
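Main Authors: | Tommer Schwarz, Toni Boltz, Kangcheng Hou, Merel Bot, Chenda Duan, Loes Olde Loohuis, Marco P. Boks, René S. Kahn, Roel A. Ophoff, Bogdan Pasaniuc |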
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2022-07-01 |
Series: | HGG Advances |
Subjects: | RNA-seq, eQTL mapping, association testing, low coverage |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666247722000197 |
_version_ | 1818015115317870592 |
---|---|
author | Tommer Schwarz; Toni Boltz; Kangcheng Hou; Merel Bot; Chenda Duan; Loes Olde Loohuis; Marco P. Boks; René S. Kahn; Roel A. Ophoff; Bogdan Pasaniuc |
author_sort | Tommer Schwarz |
collection | DOAJ |
description | Summary: Mapping genetic variants that regulate gene expression (eQTL mapping) in large-scale RNA sequencing (RNA-seq) studies is often employed to understand functional consequences of regulatory variants. However, the high cost of RNA-seq limits sample size, sequencing depth, and, therefore, discovery power in eQTL studies. In this work, we demonstrate that, given a fixed budget, eQTL discovery power can be increased by lowering the sequencing depth per sample and increasing the number of individuals sequenced in the assay. We perform RNA-seq of whole-blood tissue across 1,490 individuals at low coverage (5.9 million reads/sample) and show that the effective power is higher than that of an RNA-seq study of 570 individuals at moderate coverage (13.9 million reads/sample). Next, we leverage synthetic datasets derived from real RNA-seq data (50 million reads/sample) to explore the interplay of coverage and number of individuals in eQTL studies, and show that a 10-fold reduction in coverage leads to only a 2.5-fold reduction in statistical power to identify eQTLs. Our work suggests that lowering coverage while increasing the number of individuals in RNA-seq is an effective approach to increase discovery power in eQTL studies. |
first_indexed | 2024-04-14T06:52:45Z |
format | Article |
id | doaj.art-dd3d82f998fd44a3a02e42c16647ba33 |
institution | Directory Open Access Journal |
issn | 2666-2477 |
language | English |
last_indexed | 2024-04-14T06:52:45Z |
publishDate | 2022-07-01 |
publisher | Elsevier |
record_format | Article |
series | HGG Advances |
spelling | doaj.art-dd3d82f998fd44a3a02e42c16647ba33; 2022-12-22T02:06:58Z; eng; Elsevier; HGG Advances; 2666-2477; 2022-07-01; 3; 3; 100103; Powerful eQTL mapping through low-coverage RNA sequencing; Authors: Tommer Schwarz (0), Toni Boltz (1), Kangcheng Hou (2), Merel Bot (3), Chenda Duan (4), Loes Olde Loohuis (5), Marco P. Boks (6), René S. Kahn (7), Roel A. Ophoff (8), Bogdan Pasaniuc (9); Affiliations: (0) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Corresponding author. (1) Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA. (2) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA. (3) Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, CA, USA. (4) Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA. (5) Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA. (6) Department of Psychiatry, Brain Center University Medical Center Utrecht, University Utrecht, Utrecht, the Netherlands. (7) Department of Psychiatry, Brain Center University Medical Center Utrecht, University Utrecht, Utrecht, the Netherlands; Department of Psychiatry, Icahn School of Medicine, Mount Sinai, NY, USA. (8) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Psychiatry, Erasmus University Medical Center, Rotterdam, the Netherlands. (9) Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Corresponding author. URL: http://www.sciencedirect.com/science/article/pii/S2666247722000197; Keywords: RNA-seq, eQTL mapping, association testing, low coverage |
title | Powerful eQTL mapping through low-coverage RNA sequencing |
title_sort | powerful eqtl mapping through low coverage rna sequencing |
topic | RNA-seq; eQTL mapping; association testing; low coverage |
url | http://www.sciencedirect.com/science/article/pii/S2666247722000197 |
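
The description field compares two study designs by number of individuals and reads per sample. The arithmetic behind that comparison can be sketched as follows. This is a minimal illustration, not the authors' method: the function name `total_reads_millions` and the use of total reads as a rough stand-in for sequencing budget are assumptions made here, and the record does not describe the cost model used in the article.

```python
# Figures quoted in the record's description field.
designs = {
    "low coverage": {"individuals": 1490, "reads_per_sample_millions": 5.9},
    "moderate coverage": {"individuals": 570, "reads_per_sample_millions": 13.9},
}


def total_reads_millions(individuals, reads_per_sample_millions):
    """Total reads in millions (hypothetical proxy for sequencing budget)."""
    return individuals * reads_per_sample_millions


# Total sequencing output of each design, in millions of reads.
totals = {name: total_reads_millions(**d) for name, d in designs.items()}

# How many times more individuals the low-coverage design includes.
sample_ratio = (
    designs["low coverage"]["individuals"]
    / designs["moderate coverage"]["individuals"]
)

# How many times shallower the low-coverage design is per sample.
depth_ratio = (
    designs["moderate coverage"]["reads_per_sample_millions"]
    / designs["low coverage"]["reads_per_sample_millions"]
)

for name, total in totals.items():
    print(f"{name}: about {round(total)} million reads in total")
print(f"individuals ratio (low/moderate): {sample_ratio:.2f}")
print(f"depth ratio (moderate/low): {depth_ratio:.2f}")
```

Under this simplification the two designs have total read counts of the same order, while the low-coverage design covers roughly 2.6 times as many individuals. Per-sample costs such as library preparation are not modeled here.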