Transposable element finder (TEF): finding active transposable elements from next generation sequencing data

Abstract Background Detection of newly transposed events by transposable elements (TEs) from next generation sequence (NGS) data is difficult, due to their multiple distribution sites over the genome containing older TEs. The previously reported Transposon Insertion Finder (TIF) detects TE transposi...

Bibliographic Details
Main Authors: Akio Miyao, Utako Yamanouchi
Format: Article
Language: English
Published: BMC 2022-11-01
Series: BMC Bioinformatics
Subjects: Transposable element; Retrotransposon; Next generation sequence; Target site duplication; Tos17
Online Access: https://doi.org/10.1186/s12859-022-05011-3
_version_ 1811303916024889344
author Akio Miyao
Utako Yamanouchi
collection DOAJ
description Abstract. Background: Detecting new transposition events of transposable elements (TEs) from next generation sequencing (NGS) data is difficult, because TEs are distributed across many sites in the genome, including older insertions. The previously reported Transposon Insertion Finder (TIF) detects TE transpositions on the reference genome from NGS short reads using the end sequences of a target TE. TIF requires the sequence of the target TE and cannot detect transpositions of TEs with an unknown sequence. Results: The new algorithm Transposable Element Finder (TEF) enables the detection of TE transpositions, even for TEs with an unknown sequence. TEF finds the transposed TEs themselves, whereas TIF detects transposition sites of TEs with a known sequence. A transposition event is often accompanied by a target site duplication (TSD). Focusing on the TSD, two algorithms that detect both ends of a TE, its TSDs, and its target sites are reported here. One is based on grouping by TSDs and direct comparison of k-mers from NGS reads without a similarity search. The other is based on junction mapping of candidate TE end sequences. Both methods succeeded in detecting both ends and the TSDs of known active TEs in several tests with rice, Arabidopsis, and Drosophila data, and discovered several new TEs in new locations. PCR confirmed the detected TE transpositions in several test cases in rice. Conclusions: By directly comparing NGS data between two samples, TEF detects transposed TEs together with the TSDs produced by transposition, the sequences of both TE ends, and their insertion positions. Genotypes of transpositions are verified by counting head junctions, tail junctions, and non-insertion sequences in NGS reads. TEF is easy to run and independent of any TE library, which makes it useful for detecting insertions of unknown TEs that common TE annotation pipelines miss.
first_indexed 2024-04-13T07:56:37Z
format Article
id doaj.art-2490800f02f74a24b385ef08ea9edecb
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-13T07:56:37Z
publishDate 2022-11-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-2490800f02f74a24b385ef08ea9edecb
2022-12-22T02:55:23Z
eng
BMC
BMC Bioinformatics
1471-2105
2022-11-01
231117
10.1186/s12859-022-05011-3
Transposable element finder (TEF): finding active transposable elements from next generation sequencing data
Akio Miyao (Institute of Crop Science, National Agriculture and Food Research Organization)
Utako Yamanouchi (Institute of Crop Science, National Agriculture and Food Research Organization)
https://doi.org/10.1186/s12859-022-05011-3
Transposable element
Retrotransposon
Next generation sequence
Target site duplication
Tos17
title Transposable element finder (TEF): finding active transposable elements from next generation sequencing data
topic Transposable element
Retrotransposon
Next generation sequence
Target site duplication
Tos17
url https://doi.org/10.1186/s12859-022-05011-3
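
The abstract in this record describes TEF as directly comparing k-mers from NGS reads between two samples, then verifying genotypes by counting head junctions, tail junctions, and non-insertion sequences. The Python sketch below illustrates only that general idea and is not the authors' implementation. The function names, the k-mer length, the `min_count` threshold, and the toy reads are all hypothetical; reverse-complement handling, TSD grouping, and junction mapping are omitted.

```python
from collections import Counter


def kmers(read, k):
    """Return the set of k-mers in one read (empty if the read is shorter than k)."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def sample_specific_kmers(reads_a, reads_b, k, min_count=1):
    """k-mers seen in at least `min_count` reads of sample A and in no read of sample B.

    Assumption: exact string comparison stands in for the "direct comparison
    without similarity search" mentioned in the abstract.
    """
    counts_a = Counter()
    for read in reads_a:
        counts_a.update(kmers(read, k))  # counts reads, not occurrences
    seen_b = set()
    for read in reads_b:
        seen_b |= kmers(read, k)
    return {km for km, n in counts_a.items() if n >= min_count and km not in seen_b}


def count_junctions(reads, head_junction, tail_junction, empty_site):
    """Count reads supporting each junction type at one candidate insertion.

    Hypothetical inputs: `head_junction` and `tail_junction` span the
    flank/TE boundaries, and `empty_site` is the uninterrupted target sequence.
    """
    counts = {"head": 0, "tail": 0, "empty": 0}
    for read in reads:
        if head_junction in read:
            counts["head"] += 1
        if tail_junction in read:
            counts["tail"] += 1
        if empty_site in read:
            counts["empty"] += 1
    return counts


if __name__ == "__main__":
    # Toy data only; real reads would come from FASTQ files.
    novel = sample_specific_kmers(["AAAACCTTTT"], ["AAAACCCCGGGG"], k=4)
    print(sorted(novel))
    print(count_junctions(["AACCTTAA", "AATTGGAA", "AACCGGAA"], "CCTT", "TTGG", "CCGG"))
```

In such a sketch, reads supporting both junctions but not the empty site would suggest a homozygous insertion, while reads supporting all three would suggest a heterozygous one; any real threshold for that call is left unspecified here.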