The UCSC repeat browser allows discovery and visualization of evolutionary conflict across repeat families

Abstract Background Nearly half the human genome consists of repeat elements, most of which are retrotransposons, and many of which play important biological roles. However repeat elements pose several unique challenges to current bioinformatic analyses and visualization tools, as short repeat seque...

Bibliographic Details
Main Authors: Jason D. Fernandes, Armando Zamudio-Hurtado, Hiram Clawson, W. James Kent, David Haussler, Sofie R. Salama, Maximilian Haeussler
Format: Article
Language: English
Published: BMC 2020-03-01
Series: Mobile DNA
Subjects: Repeats, Retrotransposon, Genomics, KRAB zinc finger proteins, Evolution
Online Access: http://link.springer.com/article/10.1186/s13100-020-00208-w
collection DOAJ
description Background: Nearly half of the human genome consists of repeat elements, most of which are retrotransposons, and many of which play important biological roles. However, repeat elements pose several unique challenges to current bioinformatic analyses and visualization tools, as short repeat sequences can map to multiple genomic loci, resulting in their misclassification and misinterpretation. In fact, sequence data mapping to repeat elements are often discarded from analysis pipelines. Therefore, there is a continued need for standardized tools and techniques to interpret genomic data of repeats.
Results: We present the UCSC Repeat Browser, which consists of a complete set of human repeat reference sequences derived from annotations made by the commonly used program RepeatMasker. The UCSC Repeat Browser also provides an alignment from the human genome to these references, uses it to map the standard human genome annotation tracks, and presents all of them as a comprehensive interface to facilitate work with repetitive elements. It also provides processed tracks of multiple publicly available datasets of particular interest to the repeat community, including ChIP-seq datasets for KRAB Zinc Finger Proteins (KZNFs), a family of proteins known to bind and repress certain classes of repeats. We used the UCSC Repeat Browser in combination with these datasets, as well as RepeatMasker annotations in several non-human primates, to trace the independent trajectories of species-specific evolutionary battles between LINE-1 retroelements and their repressors. Furthermore, we document at https://repeatbrowser.ucsc.edu how researchers can map their own human genome annotations to these reference repeat sequences.
Conclusions: The UCSC Repeat Browser allows easy and intuitive visualization of genomic data on consensus repeat elements, circumventing the problem of multi-mapping, in which sequencing reads of repeat elements map to multiple locations on the human genome. By developing a reference consensus, multiple datasets and annotation tracks can easily be overlaid to reveal complex evolutionary histories of repeats in a single interactive window. Specifically, we use this approach to retrace the history of several primate-specific LINE-1 families across apes, and discover several species-specific routes of evolution that correlate with the emergence and binding of KZNFs.
first_indexed 2024-12-19T07:10:51Z
format Article
id doaj.art-57d4074818574439ab6825cf48f69a28
institution Directory Open Access Journal
issn 1759-8753
language English
last_indexed 2024-12-19T07:10:51Z
publishDate 2020-03-01
publisher BMC
record_format Article
series Mobile DNA
spelling doaj.art-57d4074818574439ab6825cf48f69a28 2022-12-21T20:31:11Z eng BMC Mobile DNA 1759-8753 2020-03-01 11 1 1 12 10.1186/s13100-020-00208-w
affiliation Genomics Institute, University of California (listed for each of the seven authors)
title The UCSC repeat browser allows discovery and visualization of evolutionary conflict across repeat families
topic Repeats
Retrotransposon
Genomics
KRAB zinc finger proteins
Evolution
url http://link.springer.com/article/10.1186/s13100-020-00208-w