RetroSnake: A modular pipeline to detect human endogenous retroviruses in genome sequencing data

Summary: Human endogenous retroviruses (HERVs) integrated into the human genome as a result of ancient exogenous infections and currently comprise ∼8% of our genome. The members of the most recently acquired HERV family, HERV-Ks, still retain the potential to produce viral molecules and have been li...


Bibliographic Details
Main Authors: Renata Kabiljo, Harry Bowles, Heather Marriott, Ashley R. Jones, Clement R. Bouton, Richard J.B. Dobson, John P. Quinn, Ahmad Al Khleifat, Chad M. Swanson, Ammar Al-Chalabi, Alfredo Iacoangeli
Format: Article
Language: English
Published: Elsevier 2022-11-01
Series: iScience
Subjects: Bioinformatics; Biocomputational method; Sequence analysis
Online Access: http://www.sciencedirect.com/science/article/pii/S2589004222015619
_version_ 1811192530686967808
author Renata Kabiljo
Harry Bowles
Heather Marriott
Ashley R. Jones
Clement R. Bouton
Richard J.B. Dobson
John P. Quinn
Ahmad Al Khleifat
Chad M. Swanson
Ammar Al-Chalabi
Alfredo Iacoangeli
author_sort Renata Kabiljo
collection DOAJ
description Summary: Human endogenous retroviruses (HERVs) integrated into the human genome as a result of ancient exogenous infections and currently comprise ∼8% of our genome. The members of the most recently acquired HERV family, HERV-Ks, still retain the potential to produce viral molecules and have been linked to a wide range of diseases, including cancer and neurodegeneration. Although a range of tools for HERV detection in NGS data exists, most of them lack wet-lab validation and do not cover all steps of the analysis. Here, we describe RetroSnake, an end-to-end, modular, computationally efficient, and customizable pipeline for the discovery of HERVs in short-read NGS data. RetroSnake is based on an extensively wet-lab validated protocol, covers all steps of the analysis from raw data to the generation of annotated results presented as an interactive HTML file, and is easy to use for life scientists without substantial computational training. Availability and implementation: The pipeline and extensive documentation are available on GitHub.
first_indexed 2024-04-11T23:54:08Z
format Article
id doaj.art-6fd85c92a4d0435a9afd0120daad3959
institution Directory Open Access Journal
issn 2589-0042
language English
last_indexed 2024-04-11T23:54:08Z
publishDate 2022-11-01
publisher Elsevier
record_format Article
series iScience
spelling doaj.art-6fd85c92a4d0435a9afd0120daad3959 2022-12-22T03:56:24Z eng Elsevier iScience 2589-0042 2022-11-01 25 11 105289
RetroSnake: A modular pipeline to detect human endogenous retroviruses in genome sequencing data
Renata Kabiljo: Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London SE5 8AF, UK; Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 9NU, UK; Corresponding author
Harry Bowles: Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London SE5 8AF, UK; Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 9NU, UK
Heather Marriott: Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London SE5 8AF, UK; Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 9NU, UK
Ashley R. Jones: Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 9NU, UK
Clement R. Bouton: Department of Infectious Diseases, School of Immunology and Microbial Sciences, King’s College London, London, UK
Richard J.B. Dobson: Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London SE5 8AF, UK; NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London, London, UK; Institute of Health Informatics, University College London, London, UK; NIHR Biomedical Research Centre at University College London Hospitals NHS Foundation Trust, London, UK
John P. Quinn: Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 3BX, UK
Ahmad Al Khleifat: Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 9NU, UK
Chad M. Swanson: Department of Infectious Diseases, School of Immunology and Microbial Sciences, King’s College London, London, UK
Ammar Al-Chalabi: Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 9NU, UK
Alfredo Iacoangeli: Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London SE5 8AF, UK; Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 9NU, UK; NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London, London, UK; Corresponding author
Summary: Human endogenous retroviruses (HERVs) integrated into the human genome as a result of ancient exogenous infections and currently comprise ∼8% of our genome. The members of the most recently acquired HERV family, HERV-Ks, still retain the potential to produce viral molecules and have been linked to a wide range of diseases, including cancer and neurodegeneration. Although a range of tools for HERV detection in NGS data exists, most of them lack wet-lab validation and do not cover all steps of the analysis. Here, we describe RetroSnake, an end-to-end, modular, computationally efficient, and customizable pipeline for the discovery of HERVs in short-read NGS data. RetroSnake is based on an extensively wet-lab validated protocol, covers all steps of the analysis from raw data to the generation of annotated results presented as an interactive HTML file, and is easy to use for life scientists without substantial computational training. Availability and implementation: The pipeline and extensive documentation are available on GitHub.
http://www.sciencedirect.com/science/article/pii/S2589004222015619
Bioinformatics
Biocomputational method
Sequence analysis
title RetroSnake: A modular pipeline to detect human endogenous retroviruses in genome sequencing data
topic Bioinformatics
Biocomputational method
Sequence analysis
url http://www.sciencedirect.com/science/article/pii/S2589004222015619