GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation

Long non-coding RNAs (lncRNAs) are recognized as an important class of regulatory molecules involved in a variety of biological functions. However, the mechanisms regulating the expression of long non-coding genes are still poorly understood. The characterization of the genomic features of lncRNAs is cr...

Bibliographic Details
Main Authors: Monah Abou Alezz, Ludovica Celli, Giulia Belotti, Antonella Lisa, Silvia Bione
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-05-01
Series: Frontiers in Genetics
Subjects: GC-AG introns; long non-coding RNAs; splice junctions; first intron; alternative splicing
Online Access: https://www.frontiersin.org/article/10.3389/fgene.2020.00488/full
_version_ 1818430697791029248
author Monah Abou Alezz
Ludovica Celli
Giulia Belotti
Antonella Lisa
Silvia Bione
author_facet Monah Abou Alezz
Ludovica Celli
Giulia Belotti
Antonella Lisa
Silvia Bione
author_sort Monah Abou Alezz
collection DOAJ
description Long non-coding RNAs (lncRNAs) are recognized as an important class of regulatory molecules involved in a variety of biological functions. However, the mechanisms regulating the expression of long non-coding genes are still poorly understood. Characterizing the genomic features of lncRNAs is crucial to gaining insight into their function. In this study, we used recent GENCODE annotations to characterize the genomic and splicing features of long non-coding genes in comparison with protein-coding ones, in both human and mouse. Our analysis highlighted differences in gene architecture between the two classes of genes. Significant differences in splice site usage were observed between long non-coding genes and protein-coding genes (PCGs). While non-canonical GC-AG splice junctions account for about 0.8% of total splice sites in PCGs, we identified a significant enrichment of GC-AG splice sites in long non-coding genes in both human (3.0%) and mouse (1.9%). In addition, we found a positional bias: GC-AG splice sites are enriched in the first intron in both classes of genes. Moreover, GC-AG introns were significantly shorter and had weaker donor and acceptor sites than GT-AG introns. Genes containing at least one GC-AG intron were found to be conserved in many species and more prone to alternative splicing, and a functional analysis pointed toward their enrichment in specific biological processes such as DNA repair. Our study shows for the first time that GC-AG introns are mainly associated with lncRNAs and are preferentially located in the first intron. Additionally, we discovered their regulatory potential, indicating the existence of a new mechanism regulating the expression of both non-coding genes and PCGs.
first_indexed 2024-12-14T15:37:32Z
format Article
id doaj.art-33578e41855541dc85da390c32440510
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-12-14T15:37:32Z
publishDate 2020-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-33578e41855541dc85da390c32440510 | 2022-12-21T22:55:41Z | eng | Frontiers Media S.A. | Frontiers in Genetics | 1664-8021 | 2020-05-01 | 11 | 10.3389/fgene.2020.00488 | 519976 | GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation | Monah Abou Alezz; Ludovica Celli; Giulia Belotti; Antonella Lisa; Silvia Bione | Long non-coding RNAs (lncRNAs) are recognized as an important class of regulatory molecules involved in a variety of biological functions. However, the mechanisms regulating the expression of long non-coding genes are still poorly understood. Characterizing the genomic features of lncRNAs is crucial to gaining insight into their function. In this study, we used recent GENCODE annotations to characterize the genomic and splicing features of long non-coding genes in comparison with protein-coding ones, in both human and mouse. Our analysis highlighted differences in gene architecture between the two classes of genes. Significant differences in splice site usage were observed between long non-coding genes and protein-coding genes (PCGs). While non-canonical GC-AG splice junctions account for about 0.8% of total splice sites in PCGs, we identified a significant enrichment of GC-AG splice sites in long non-coding genes in both human (3.0%) and mouse (1.9%). In addition, we found a positional bias: GC-AG splice sites are enriched in the first intron in both classes of genes. Moreover, GC-AG introns were significantly shorter and had weaker donor and acceptor sites than GT-AG introns. Genes containing at least one GC-AG intron were found to be conserved in many species and more prone to alternative splicing, and a functional analysis pointed toward their enrichment in specific biological processes such as DNA repair. Our study shows for the first time that GC-AG introns are mainly associated with lncRNAs and are preferentially located in the first intron. Additionally, we discovered their regulatory potential, indicating the existence of a new mechanism regulating the expression of both non-coding genes and PCGs. | https://www.frontiersin.org/article/10.3389/fgene.2020.00488/full | GC-AG introns; long non-coding RNAs; splice junctions; first intron; alternative splicing
spellingShingle Monah Abou Alezz
Ludovica Celli
Giulia Belotti
Antonella Lisa
Silvia Bione
GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation
Frontiers in Genetics
GC-AG introns
long non-coding RNAs
splice junctions
first intron
alternative splicing
title GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation
title_full GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation
title_fullStr GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation
title_full_unstemmed GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation
title_short GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation
title_sort gc ag introns features in long non coding and protein coding genes suggest their role in gene expression regulation
topic GC-AG introns
long non-coding RNAs
splice junctions
first intron
alternative splicing
url https://www.frontiersin.org/article/10.3389/fgene.2020.00488/full
work_keys_str_mv AT monahaboualezz gcagintronsfeaturesinlongnoncodingandproteincodinggenessuggesttheirroleingeneexpressionregulation
AT ludovicacelli gcagintronsfeaturesinlongnoncodingandproteincodinggenessuggesttheirroleingeneexpressionregulation
AT giuliabelotti gcagintronsfeaturesinlongnoncodingandproteincodinggenessuggesttheirroleingeneexpressionregulation
AT antonellalisa gcagintronsfeaturesinlongnoncodingandproteincodinggenessuggesttheirroleingeneexpressionregulation
AT silviabione gcagintronsfeaturesinlongnoncodingandproteincodinggenessuggesttheirroleingeneexpressionregulation