Application of Causal Inference to Genomic Analysis: Advances in Methodology

The current paradigm of genomic studies of complex diseases is association and correlation analysis. Despite significant progress in dissecting the genetic architecture of complex diseases by genome-wide association studies (GWAS), the genetic variants identified by GWAS can explain only a small pro...

Bibliographic Details
Main Authors: Pengfei Hu, Rong Jiao, Li Jin, Momiao Xiong
Format: Article
Language: English
Published: Frontiers Media S.A. 2018-07-01
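Series: Frontiers in Genetics
Subjects: causal inference; genomic analysis; additive noise models for discrete variables; association analysis; entropy
Online Access: https://www.frontiersin.org/article/10.3389/fgene.2018.00238/full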
collection DOAJ
description The current paradigm of genomic studies of complex diseases is association and correlation analysis. Despite significant progress in dissecting the genetic architecture of complex diseases by genome-wide association studies (GWAS), the genetic variants identified by GWAS explain only a small proportion of the heritability of complex diseases. A large fraction of genetic variants remains hidden. Association analysis has limited power to unravel the mechanisms of complex diseases. It is time to shift the paradigm of genomic analysis from association analysis to causal inference. Causal inference is an essential component in the discovery of disease mechanisms. This paper reviews the major platforms of genomic analysis to date and discusses the prospects of causal inference as a general framework for genomic analysis. In genomic data analysis, we usually consider four types of association: association of discrete variables (DNA variation) with continuous variables (phenotypes and gene expressions); association of continuous variables (expressions, methylations, and imaging signals) with continuous variables (gene expressions, imaging signals, phenotypes, and physiological traits); association of discrete variables (DNA variation) with a binary trait (disease status); and association of continuous variables (gene expressions, methylations, phenotypes, and imaging signals) with a binary trait (disease status). In this paper, we review algorithmic information theory as a general framework for causal discovery and the recent development of statistical methods for causal inference on discrete data, and we discuss the possibility of extending the association analysis of discrete variables with disease to the causal analysis of discrete variables and disease.
first_indexed 2024-12-22T08:39:05Z
id doaj.art-b33bde915728466887fb7c28be773ec1
institution Directory of Open Access Journals
issn 1664-8021
last_indexed 2024-12-22T08:39:05Z
spelling doaj.art-b33bde915728466887fb7c28be773ec1 | 2022-12-21T18:32:17Z | eng | Frontiers Media S.A. | Frontiers in Genetics | 1664-8021 | 2018-07-01 | 9 | 10.3389/fgene.2018.00238 | 374509
Pengfei Hu: Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
Rong Jiao: Department of Biostatistics and Data Science, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, United States
Li Jin: State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China; Human Phenome Institute, Fudan University, Shanghai, China
Momiao Xiong: Department of Biostatistics and Data Science, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, United States
topic causal inference
genomic analysis
additive noise models for discrete variables
association analysis
entropy