A Machine Learning Approach to Predicting Autism Risk Genes: Validation of Known Genes and Discovery of New Candidates
Autism spectrum disorder (ASD) is a complex neurodevelopmental condition with a strong genetic basis. The role of de novo mutations in ASD has been well established, but the set of genes implicated to date is still far from complete. The current study employs a machine learning-based approach to pre...
Main Authors: | Ying Lin, Shiva Afshar, Anjali M. Rajadhyaksha, James B. Potash, Shizhong Han |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-09-01 |
Series: | Frontiers in Genetics |
Subjects: | autism; de novo mutation; gene expression; constraint; machine learning |
Online Access: | https://www.frontiersin.org/article/10.3389/fgene.2020.500064/full |
_version_ | 1818112838941540352 |
---|---|
author | Ying Lin; Shiva Afshar; Anjali M. Rajadhyaksha; James B. Potash; Shizhong Han |
author_sort | Ying Lin |
collection | DOAJ |
description | Autism spectrum disorder (ASD) is a complex neurodevelopmental condition with a strong genetic basis. The role of de novo mutations in ASD is well established, but the set of genes implicated to date is still far from complete. The current study employs a machine learning-based approach to predict ASD risk genes using features from spatiotemporal gene expression patterns in the human brain, gene-level constraint metrics, and other gene variation features. The genes identified through our prediction model were enriched for independent sets of ASD risk genes and tended to be down-expressed in ASD brains, especially in the frontal and parietal cortex. The highest-ranked genes not only included those with strong prior evidence for involvement in ASD (for example, NBEA, HERC1, and TCF20), but also indicated potentially novel candidates, such as MYCBP2 and CAND1, which are involved in protein ubiquitination. We also showed that our method outperformed state-of-the-art scoring systems for ranking curated ASD candidate genes. Gene ontology enrichment analysis of our predicted risk genes revealed biological processes clearly relevant to ASD, including neuronal signaling, neurogenesis, and chromatin remodeling, but also highlighted other potential mechanisms that might underlie ASD, such as regulation of RNA alternative splicing and the ubiquitination pathway related to protein degradation. Our study demonstrates that human brain spatiotemporal gene expression patterns and gene-level constraint metrics can help predict ASD risk genes. Our gene ranking system provides a useful resource for prioritizing ASD candidate genes. |
first_indexed | 2024-12-11T03:25:18Z |
format | Article |
id | doaj.art-668731df287a4d50a6197530ad1e1daf |
institution | Directory Open Access Journal |
issn | 1664-8021 |
language | English |
last_indexed | 2024-12-11T03:25:18Z |
publishDate | 2020-09-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-668731df287a4d50a6197530ad1e1daf; 2022-12-22T01:22:32Z; eng; Frontiers Media S.A.; Frontiers in Genetics; ISSN 1664-8021; 2020-09-01; volume 11; DOI 10.3389/fgene.2020.500064; article 500064. Author affiliations: Ying Lin and Shiva Afshar (Department of Industrial Engineering, University of Houston, Houston, TX, United States); Anjali M. Rajadhyaksha (Division of Pediatric Neurology, Department of Pediatrics, Weill Cornell Medicine, New York, NY, United States; Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, New York, NY, United States; Weill Cornell Autism Research Program, Weill Cornell Medicine, New York, NY, United States); James B. Potash (Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, United States); Shizhong Han (Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, United States; Lieber Institute for Brain Development, Baltimore, MD, United States). |
title | A Machine Learning Approach to Predicting Autism Risk Genes: Validation of Known Genes and Discovery of New Candidates |
topic | autism; de novo mutation; gene expression; constraint; machine learning |
url | https://www.frontiersin.org/article/10.3389/fgene.2020.500064/full |
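
The description in this record states that the predicted genes were enriched for independent sets of ASD risk genes. The record does not say which statistical test the authors used, so the following is only a minimal sketch of one common way to quantify such an overlap, a one-sided hypergeometric test. The function name, its parameters, and the counts in the usage lines are all hypothetical and are not taken from the article.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, risk_genes: int,
                           predicted: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    Models drawing `predicted` genes without replacement from a background
    of `universe` genes, of which `risk_genes` belong to the reference set.
    All names are illustrative; this is not the authors' implementation.
    """
    if not (0 <= overlap <= min(risk_genes, predicted) <= universe):
        raise ValueError("inconsistent counts")
    total = comb(universe, predicted)
    upper = min(risk_genes, predicted)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(risk_genes, k) * comb(universe - risk_genes, predicted - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to show the call; not results from the study.
p_value = hypergeom_enrichment_p(universe=18000, risk_genes=100,
                                 predicted=500, overlap=20)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here would indicate that the overlap between the top-ranked genes and the reference set is larger than expected by chance under this simple null model; it does not account for gene length, mutation rate, or other covariates.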