Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor

Bibliographic Details
Main Authors: Nozhat T. Hassan, David L. Adelson
Format: Article
Language: English
Published: BMC 2023-11-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-023-03102-9
Description
Summary: Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study shows that DNA transposons in non-mammalian species have been erroneously annotated as the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2), because that gene contains a 3′ fused hAT transposase domain. The study also demonstrates the generality of this problem by identifying TEs misannotated as genes in other vertebrate genomes. Such misannotations can lead to errors in phylogenetic analyses and wasted time for investigators. The study proposes adding a final TE-check to gene annotation pipelines to mitigate this problem.
ISSN: 1474-760X