Orthoptera-TElib: a library of Orthoptera transposable elements for TE annotation
Abstract Transposable elements (TEs) are a major component of eukaryotic genomes and are present in almost all eukaryotic organisms. TEs are highly dynamic between and within species, which significantly affects the general applicability of the TE databases. Orthoptera is the only known group in the...
Main Authors: | Xuanzeng Liu, Lina Zhao, Muhammad Majid, Yuan Huang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-03-01 |
Series: | Mobile DNA |
Subjects: | Transposable elements; Orthoptera genome; TE database; Dfam and Repbase; De novo annotation |
Online Access: | https://doi.org/10.1186/s13100-024-00316-x |
_version_ | 1827315952273850368 |
---|---|
author | Xuanzeng Liu, Lina Zhao, Muhammad Majid, Yuan Huang |
author_sort | Xuanzeng Liu |
collection | DOAJ |
description | Abstract Transposable elements (TEs) are a major component of eukaryotic genomes and are present in almost all eukaryotic organisms. TEs are highly dynamic between and within species, which significantly affects the general applicability of the TE databases. Orthoptera is the only known group in the class Insecta with a significantly enlarged genome (0.93-21.48 Gb). When such large genomes are analyzed with the existing public TE databases, the efficiency of TE annotation is unsatisfactory. To address this limitation, the available TE resource libraries need to be updated continually, and an Orthoptera-specific library is needed as more insect genomes become publicly available. Here, we used the complete genome data of 12 Orthoptera species to annotate TEs de novo, then manually re-annotated the unclassified TEs to construct a non-redundant Orthoptera-specific TE library: Orthoptera-TElib. Orthoptera-TElib contains 24,021 TE entries, including the re-annotated results of 13,964 unknown TEs. TE entries in Orthoptera-TElib follow the same naming convention as RepeatMasker and Dfam, encoded in the three-level form “level1/level2-level3”. Orthoptera-TElib can be used directly as an input reference database and is compatible with mainstream repetitive sequence analysis software such as RepeatMasker and dnaPipeTE. When analyzing TEs of Orthoptera species, Orthoptera-TElib gives better TE annotation than Dfam and Repbase, whether low-coverage sequencing data or genome assembly data are used. The largest improvement is for Angaracris rhodopa, where the annotated TE fraction increased from 7.89% of the genome to 53.28%. Finally, Orthoptera-TElib is stored in SQLite3 for the convenience of data updates and user access. |
first_indexed | 2024-04-24T23:05:33Z |
format | Article |
id | doaj.art-14fcdfaa751b4f0cbbfe995421fa14a5 |
institution | Directory of Open Access Journals |
issn | 1759-8753 |
language | English |
last_indexed | 2024-04-24T23:05:33Z |
publishDate | 2024-03-01 |
publisher | BMC |
record_format | Article |
series | Mobile DNA |
spelling | doaj.art-14fcdfaa751b4f0cbbfe995421fa14a5; 2024-03-17T12:28:42Z; eng; BMC; Mobile DNA; 1759-8753; 2024-03-01; 151111; 10.1186/s13100-024-00316-x; Orthoptera-TElib: a library of Orthoptera transposable elements for TE annotation; Xuanzeng Liu, Lina Zhao, Muhammad Majid, Yuan Huang (all: College of Life Sciences, Shaanxi Normal University); https://doi.org/10.1186/s13100-024-00316-x; Transposable elements; Orthoptera genome; TE database; Dfam and Repbase; De novo annotation |
title | Orthoptera-TElib: a library of Orthoptera transposable elements for TE annotation |
topic | Transposable elements; Orthoptera genome; TE database; Dfam and Repbase; De novo annotation |
url | https://doi.org/10.1186/s13100-024-00316-x |
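The abstract states that entries are named in the three-level form “level1/level2-level3” and that the library is stored in SQLite3. The following is a minimal sketch of how such labels could be split and loaded into SQLite. It is not the authors' implementation: the table name `te_entry`, its columns, and the entry names `te_0001` to `te_0003` are hypothetical, and the example labels are common RepeatMasker-style class labels chosen for illustration rather than entries taken from Orthoptera-TElib.

```python
import sqlite3
from typing import NamedTuple, Optional


class TEClass(NamedTuple):
    level1: str
    level2: Optional[str]
    level3: Optional[str]


def parse_te_class(label: str) -> TEClass:
    """Split a 'level1/level2-level3' label; missing levels become None."""
    level1, _, rest = label.partition("/")
    if not rest:
        return TEClass(level1, None, None)
    # Split level2 from level3 at the first hyphen only.
    level2, _, level3 = rest.partition("-")
    return TEClass(level1, level2, level3 or None)


def build_demo_db(records):
    """Load (name, label) pairs into an in-memory SQLite database.

    The schema is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE te_entry ("
        "name TEXT PRIMARY KEY, level1 TEXT NOT NULL, level2 TEXT, level3 TEXT)"
    )
    rows = [(name, *parse_te_class(label)) for name, label in records]
    conn.executemany("INSERT INTO te_entry VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


# Hypothetical entries with illustrative class labels.
demo = [
    ("te_0001", "DNA/TcMar-Tc1"),
    ("te_0002", "LTR/Gypsy"),
    ("te_0003", "LINE/RTE-BovB"),
]
conn = build_demo_db(demo)
counts = conn.execute(
    "SELECT level1, COUNT(*) FROM te_entry GROUP BY level1 ORDER BY level1"
).fetchall()
print(counts)
```

Keeping the three levels in separate columns makes it straightforward to summarize entries per class or superfamily with ordinary SQL, which is one plausible reason a relational store is convenient for updates and user access.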