Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor
Abstract Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study reveals how the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2) erroneously annotated...
Main Authors: | Nozhat T. Hassan, David L. Adelson |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2023-11-01 |
Series: | Genome Biology |
Subjects: | Transposable element, Genome, Annotation, Transcription factor, GTF2, DNA transposon |
Online Access: | https://doi.org/10.1186/s13059-023-03102-9 |
_version_ | 1797559373641285632 |
---|---|
author | Nozhat T. Hassan, David L. Adelson |
author_facet | Nozhat T. Hassan, David L. Adelson |
author_sort | Nozhat T. Hassan |
collection | DOAJ |
description | Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study reveals how the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2) erroneously annotated DNA transposons in non-mammalian species, as it contains a 3′ fused hAT transposase domain. We also demonstrate the generality of this problem by identifying misannotated TEs as genes in other vertebrate genomes. Such misannotations can lead to errors in phylogenetic analyses and wasted time for investigators. The study proposes adding a final TE-check to gene annotation pipelines to mitigate this problem. |
first_indexed | 2024-03-10T17:44:27Z |
format | Article |
id | doaj.art-ffd46fdf531d4ed292c4d4953c9cda7c |
institution | Directory Open Access Journal |
issn | 1474-760X |
language | English |
last_indexed | 2024-03-10T17:44:27Z |
publishDate | 2023-11-01 |
publisher | BMC |
record_format | Article |
series | Genome Biology |
spelling | doaj.art-ffd46fdf531d4ed292c4d4953c9cda7c; 2023-11-20T09:35:21Z; eng; BMC; Genome Biology; 1474-760X; 2023-11-01; 24118; 10.1186/s13059-023-03102-9; Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor; Nozhat T. Hassan 0; David L. Adelson 1; School of Biological Sciences, University of Adelaide; School of Biological Sciences, University of Adelaide; Abstract Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study reveals how the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2) erroneously annotated DNA transposons in non-mammalian species, as it contains a 3′ fused hAT transposase domain. We also demonstrate the generality of this problem by identifying misannotated TEs as genes in other vertebrate genomes. Such misannotations can lead to errors in phylogenetic analyses and wasted time for investigators. The study proposes adding a final TE-check to gene annotation pipelines to mitigate this problem.; https://doi.org/10.1186/s13059-023-03102-9; Transposable element; Genome; Annotation; Transcription factor; GTF2; DNA transposon |
spellingShingle | Nozhat T. Hassan David L. Adelson Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor Genome Biology Transposable element Genome Annotation Transcription factor GTF2 DNA transposon |
title | Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor |
title_sort | fake ids widespread misannotation of dna transposons as a general transcription factor |
topic | Transposable element, Genome, Annotation, Transcription factor, GTF2, DNA transposon |
url | https://doi.org/10.1186/s13059-023-03102-9 |
work_keys_str_mv | AT nozhatthassan fakeidswidespreadmisannotationofdnatransposonsasageneraltranscriptionfactor AT davidladelson fakeidswidespreadmisannotationofdnatransposonsasageneraltranscriptionfactor |