PathoLive—Real-Time Pathogen Identification from Metagenomic Illumina Datasets
Over the past years, NGS has become a crucial workhorse for open-view pathogen diagnostics. Yet, long turnaround times result from using massively parallel high-throughput technologies as the analysis can only be performed after sequencing has finished. The interpretation of results can further be c...
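Main Authors: | Simon H. Tausch, Tobias P. Loka, Jakob M. Schulze, Andreas Andrusch, Jeanette Klenner, Piotr Wojciech Dabrowski, Martin S. Lindner, Andreas Nitsche, Bernhard Y. Renard |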
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-08-01 |
Series: | Life |
Subjects: | NGS, metagenomics, viruses, infectious diseases, diagnostics, live sequencing |
Online Access: | https://www.mdpi.com/2075-1729/12/9/1345 |
_version_ | 1797485761036025856 |
---|---|
author | Simon H. Tausch Tobias P. Loka Jakob M. Schulze Andreas Andrusch Jeanette Klenner Piotr Wojciech Dabrowski Martin S. Lindner Andreas Nitsche Bernhard Y. Renard |
author_sort | Simon H. Tausch |
collection | DOAJ |
description | Over the past years, NGS has become a crucial workhorse for open-view pathogen diagnostics. Yet, long turnaround times result from using massively parallel high-throughput technologies as the analysis can only be performed after sequencing has finished. The interpretation of results can further be challenged by contaminations, clinically irrelevant sequences, and the sheer amount and complexity of the data. We implemented PathoLive, a real-time diagnostics pipeline for the detection of pathogens from clinical samples hours before sequencing has finished. Based on real-time alignment with HiLive2, mappings are scored with respect to common contaminations, low-entropy areas, and sequences of widespread, non-pathogenic organisms. The results are visualized using an interactive taxonomic tree that provides an easily interpretable overview of the relevance of hits. For a human plasma sample that was spiked in vitro with six pathogenic viruses, all agents were clearly detected after only 40 of 200 sequencing cycles. For a real-world sample from Sudan, the results correctly indicated the presence of Crimean-Congo hemorrhagic fever virus. In a second real-world dataset from the 2019 SARS-CoV-2 outbreak in Wuhan, we found the presence of a SARS coronavirus as the most relevant hit without the novel virus reference genome being included in the database. For all samples, clinically irrelevant hits were correctly de-emphasized. Our approach is valuable to obtain fast and accurate NGS-based pathogen identifications and correctly prioritize and visualize them based on their clinical significance: PathoLive is open source and available on GitLab and BioConda. |
first_indexed | 2024-03-09T23:23:26Z |
format | Article |
id | doaj.art-71a834c2edc44da89810b9999b97ff6f |
institution | Directory Open Access Journal |
issn | 2075-1729 |
language | English |
last_indexed | 2024-03-09T23:23:26Z |
publishDate | 2022-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Life |
spelling | doaj.art-71a834c2edc44da89810b9999b97ff6f; 2023-11-23T17:22:32Z; eng; MDPI AG; Life; 2075-1729; 2022-08-01; 12; 9; 1345; 10.3390/life12091345; PathoLive—Real-Time Pathogen Identification from Metagenomic Illumina Datasets; Simon H. Tausch [0], Tobias P. Loka [1], Jakob M. Schulze [2], Andreas Andrusch [3], Jeanette Klenner [4], Piotr Wojciech Dabrowski [5], Martin S. Lindner [6], Andreas Nitsche [7], Bernhard Y. Renard [8]; [0] National Study Centre for Sequencing in Risk Assessment, Department Biological Safety, German Federal Institute for Risk Assessment, 10589 Berlin, Germany; [1] Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany; [2] Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany; [3] Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany; [4] Centre for Biological Threats and Special Pathogens, Highly Pathogenic Viruses (ZBS 1), 13353 Berlin, Germany; [5] Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany; [6] Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany; [7] Centre for Biological Threats and Special Pathogens, Highly Pathogenic Viruses (ZBS 1), 13353 Berlin, Germany; [8] Bioinformatics Division (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany; https://www.mdpi.com/2075-1729/12/9/1345; NGS, metagenomics, viruses, infectious diseases, diagnostics, live sequencing |
title | PathoLive—Real-Time Pathogen Identification from Metagenomic Illumina Datasets |
topic | NGS metagenomics viruses infectious diseases diagnostics live sequencing |
url | https://www.mdpi.com/2075-1729/12/9/1345 |