PathoLive—Real-Time Pathogen Identification from Metagenomic Illumina Datasets

Bibliographic Details
Main Authors: Simon H. Tausch, Tobias P. Loka, Jakob M. Schulze, Andreas Andrusch, Jeanette Klenner, Piotr Wojciech Dabrowski, Martin S. Lindner, Andreas Nitsche, Bernhard Y. Renard
Format: Article
Language: English
Published: MDPI AG 2022-08-01
Series: Life
Subjects: NGS; metagenomics; viruses; infectious diseases; diagnostics; live sequencing
Online Access: https://www.mdpi.com/2075-1729/12/9/1345
_version_ 1797485761036025856
author Simon H. Tausch
Tobias P. Loka
Jakob M. Schulze
Andreas Andrusch
Jeanette Klenner
Piotr Wojciech Dabrowski
Martin S. Lindner
Andreas Nitsche
Bernhard Y. Renard
collection DOAJ
description Over the past years, NGS has become a crucial workhorse for open-view pathogen diagnostics. Yet, long turnaround times result from using massively parallel high-throughput technologies, as the analysis can only be performed after sequencing has finished. The interpretation of results can further be challenged by contaminations, clinically irrelevant sequences, and the sheer amount and complexity of the data. We implemented PathoLive, a real-time diagnostics pipeline for the detection of pathogens from clinical samples hours before sequencing has finished. Based on real-time alignment with HiLive2, mappings are scored with respect to common contaminations, low-entropy areas, and sequences of widespread, non-pathogenic organisms. The results are visualized using an interactive taxonomic tree that provides an easily interpretable overview of the relevance of hits. For a human plasma sample that was spiked in vitro with six pathogenic viruses, all agents were clearly detected after only 40 of 200 sequencing cycles. For a real-world sample from Sudan, the results correctly indicated the presence of Crimean-Congo hemorrhagic fever virus. In a second real-world dataset from the 2019 SARS-CoV-2 outbreak in Wuhan, we found the presence of a SARS coronavirus as the most relevant hit without the novel virus reference genome being included in the database. For all samples, clinically irrelevant hits were correctly de-emphasized. Our approach is valuable to obtain fast and accurate NGS-based pathogen identifications and correctly prioritize and visualize them based on their clinical significance. PathoLive is open source and available on GitLab and BioConda.
first_indexed 2024-03-09T23:23:26Z
format Article
id doaj.art-71a834c2edc44da89810b9999b97ff6f
institution Directory Open Access Journal
issn 2075-1729
language English
last_indexed 2024-03-09T23:23:26Z
publishDate 2022-08-01
publisher MDPI AG
record_format Article
series Life
spelling doaj.art-71a834c2edc44da89810b9999b97ff6f
2023-11-23T17:22:32Z
eng
MDPI AG
Life, ISSN 2075-1729
2022-08-01, vol. 12, no. 9, article 1345
doi:10.3390/life12091345
PathoLive—Real-Time Pathogen Identification from Metagenomic Illumina Datasets
Simon H. Tausch (National Study Centre for Sequencing in Risk Assessment, Department Biological Safety, German Federal Institute for Risk Assessment, 10589 Berlin, Germany)
Tobias P. Loka (Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany)
Jakob M. Schulze (Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany)
Andreas Andrusch (Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany)
Jeanette Klenner (Centre for Biological Threats and Special Pathogens, Highly Pathogenic Viruses (ZBS 1), 13353 Berlin, Germany)
Piotr Wojciech Dabrowski (Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany)
Martin S. Lindner (Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany)
Andreas Nitsche (Centre for Biological Threats and Special Pathogens, Highly Pathogenic Viruses (ZBS 1), 13353 Berlin, Germany)
Bernhard Y. Renard (Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany)
https://www.mdpi.com/2075-1729/12/9/1345
NGS; metagenomics; viruses; infectious diseases; diagnostics; live sequencing
title PathoLive—Real-Time Pathogen Identification from Metagenomic Illumina Datasets
topic NGS
metagenomics
viruses
infectious diseases
diagnostics
live sequencing
url https://www.mdpi.com/2075-1729/12/9/1345