ASTK: A Machine Learning‐Based Integrative Software for Alternative Splicing Analysis

Alternative splicing (AS) is a fundamental mechanism that regulates gene expression in both physiological and pathological processes. This article introduces ASTK, a software package covering upstream and downstream analysis of AS. Initially, ASTK offers a module to perform enrichment analysis at bot...


Bibliographic Details
Main Authors: Shenghui Huang, Jiangshuang He, Lei Yu, Jun Guo, Shangying Jiang, Zhaoxia Sun, Linghui Cheng, Xing Chen, Xiang Ji, Yi Zhang
Format: Article
Language: English
Published: Wiley 2024-04-01
Series: Advanced Intelligent Systems
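Subjects: alternative splicing, epigenetic marks, functional enrichment, machine learning, sequence features, splicing codes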
Online Access: https://doi.org/10.1002/aisy.202300594
collection DOAJ
description Alternative splicing (AS) is a fundamental mechanism that regulates gene expression in both physiological and pathological processes. This article introduces ASTK, a software package covering upstream and downstream analysis of AS. First, ASTK offers a module to perform enrichment analysis at both the gene and exon levels to account for the differing impacts of distinct splicing events on a single gene. We further cluster AS genes and alternative exons into three groups based on spliced exon size (micro‐, mid‐, and macro‐), which are preferentially associated with distinct biological pathways. A major challenge in the field has been decoding the regulatory codes of splicing. ASTK extracts both sequence features and epigenetic marks associated with AS events. Using machine learning algorithms, we identified pivotal features influencing the inclusion levels of most AS types. Notably, splice site strength is a primary determinant of inclusion levels for alternative 3′/5′ splice sites (A3/A5). For the alternative first exon and skipping exon classes, a combination of sequence and epigenetic features together determines exon inclusion/exclusion. Our findings underscore ASTK's capability to enhance the functional understanding of AS events and shed light on the intricacies of splicing regulation.
first_indexed 2024-04-24T06:46:35Z
id doaj.art-6409eba008de4d1e8cd2651b11015380
institution Directory of Open Access Journals
issn 2640-4567
spelling doaj.art-6409eba008de4d1e8cd2651b11015380; 2024-04-22T18:07:16Z; eng; Wiley; Advanced Intelligent Systems; ISSN 2640-4567; 2024-04-01; volume 6, issue 4; pages n/a; 10.1002/aisy.202300594
Author affiliations:
Shenghui Huang, Jiangshuang He, Lei Yu, Jun Guo, Shangying Jiang, Zhaoxia Sun, Yi Zhang: Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, China; School of Laboratory Medicine and Life Science, Wenzhou Medical University, Wenzhou, Zhejiang Province 325035, China
Linghui Cheng: Scientific Research Center, Wenzhou Medical University, Wenzhou, Zhejiang Province 325035, China
Xing Chen: School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
Xiang Ji: Department of Mathematics, School of Science & Engineering, Tulane University, New Orleans, LA 70118, USA
title ASTK: A Machine Learning‐Based Integrative Software for Alternative Splicing Analysis
topic alternative splicing
epigenetic marks
functional enrichment
machine learning
sequence features
splicing codes
url https://doi.org/10.1002/aisy.202300594