Orthoptera-TElib: a library of Orthoptera transposable elements for TE annotation

Abstract Transposable elements (TEs) are a major component of eukaryotic genomes and are present in almost all eukaryotic organisms. TEs are highly dynamic between and within species, which significantly limits the general applicability of TE databases. Orthoptera is the only known group in the...

Bibliographic Details
Main Authors: Xuanzeng Liu, Lina Zhao, Muhammad Majid, Yuan Huang
Format: Article
Language: English
Published: BMC 2024-03-01
Series: Mobile DNA
Subjects: Transposable elements; Orthoptera genome; TE database; Dfam and Repbase; De novo annotation
Online Access: https://doi.org/10.1186/s13100-024-00316-x
author Xuanzeng Liu
Lina Zhao
Muhammad Majid
Yuan Huang
collection DOAJ
description Abstract Transposable elements (TEs) are a major component of eukaryotic genomes and are present in almost all eukaryotic organisms. TEs are highly dynamic between and within species, which significantly limits the general applicability of TE databases. Orthoptera is the only known group in the class Insecta with significantly enlarged genomes (0.93-21.48 Gb). When these large genomes are analyzed with existing public TE databases, the efficiency of TE annotation is unsatisfactory. To address this limitation, the available TE resource libraries need to be continually updated, and an Orthoptera-specific library is needed as more insect genomes become publicly available. Here, we used the complete genome data of 12 Orthoptera species to annotate TEs de novo, then manually re-annotated the unclassified TEs to construct a non-redundant Orthoptera-specific TE library: Orthoptera-TElib. Orthoptera-TElib contains 24,021 TE entries, including the re-annotated results of 13,964 unknown TEs. TE entries in Orthoptera-TElib follow the same naming convention as RepeatMasker and Dfam, encoded in the three-level form “level1/level2-level3”. Orthoptera-TElib can be used directly as an input reference database and is compatible with mainstream repetitive sequence analysis software such as RepeatMasker and dnaPipeTE. When analyzing TEs of Orthoptera species, Orthoptera-TElib yields better TE annotation than Dfam and Repbase, whether low-coverage sequencing data or genome assembly data are used. The largest improvement is for Angaracris rhodopa, whose annotated TE content increased from 7.89% of the genome to 53.28%. Finally, Orthoptera-TElib is stored in SQLite3 for the convenience of data updates and user access.
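The three-level naming form “level1/level2-level3” mentioned in the abstract can be illustrated with a short sketch. This is only an illustration of how such a label might be split, not code from Orthoptera-TElib itself: the function name parse_te_class, the TEClass type, and the example labels (DNA/TcMar-Tc1 and LTR/Gypsy, which follow RepeatMasker-style classification) are assumptions introduced here, and the record does not say whether every entry carries all three levels.

```python
from typing import NamedTuple, Optional


class TEClass(NamedTuple):
    """Hypothetical container for a 'level1/level2-level3' label."""
    level1: str
    level2: Optional[str]
    level3: Optional[str]


def parse_te_class(label: str) -> TEClass:
    """Split a classification label into up to three levels.

    Assumption: level1 and level2 are separated by the first '/',
    and level2 and level3 by the first '-' that follows it.
    Missing levels are returned as None.
    """
    level1, _, rest = label.strip().partition("/")
    if not level1:
        raise ValueError(f"label has no level1 part: {label!r}")
    level2, _, level3 = rest.partition("-")
    return TEClass(level1, level2 or None, level3 or None)


if __name__ == "__main__":
    # Example labels are illustrative, not taken from the library.
    for label in ("DNA/TcMar-Tc1", "LTR/Gypsy", "Unknown"):
        print(label, "->", parse_te_class(label))
```

Returning None for absent levels keeps two-level and one-level labels usable without special cases in downstream grouping or counting.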
first_indexed 2024-04-24T23:05:33Z
format Article
id doaj.art-14fcdfaa751b4f0cbbfe995421fa14a5
institution Directory Open Access Journal
issn 1759-8753
language English
last_indexed 2024-04-24T23:05:33Z
publishDate 2024-03-01
publisher BMC
record_format Article
series Mobile DNA
spelling doaj.art-14fcdfaa751b4f0cbbfe995421fa14a5 2024-03-17T12:28:42Z eng BMC Mobile DNA 1759-8753 2024-03-01 151111 10.1186/s13100-024-00316-x Orthoptera-TElib: a library of Orthoptera transposable elements for TE annotation Xuanzeng Liu, Lina Zhao, Muhammad Majid, Yuan Huang (all: College of Life Sciences, Shaanxi Normal University) https://doi.org/10.1186/s13100-024-00316-x
title Orthoptera-TElib: a library of Orthoptera transposable elements for TE annotation
topic Transposable elements
Orthoptera genome
TE database
Dfam and Repbase
De novo annotation
url https://doi.org/10.1186/s13100-024-00316-x