Scalable Pathogen Pipeline Platform (SP^3): enabling unified genomic data analysis with elastic cloud computing

Pathogen genomic data analysis can be extremely bespoke and diverse. This paper presents our plan and progress towards creating a Scalable Pathogen Pipeline Platform (SP3) that provides an efficient and unified process for collecting, analysing and comparing genomic data, with the benefit of elas...

Bibliographic Details
Main Authors: Yang-Turner, F, Volk, D, Fowler, P, Swann, J, Bull, M, Hoosdally, S, Connor, T, Peto, T, Crook, D
Format: Conference item
Published: IEEE Digital Library 2019
author Yang-Turner, F
Volk, D
Fowler, P
Swann, J
Bull, M
Hoosdally, S
Connor, T
Peto, T
Crook, D
collection OXFORD
description Pathogen genomic data analysis can be extremely bespoke and diverse. This paper presents our plan and progress towards creating a Scalable Pathogen Pipeline Platform (SP3) that provides an efficient and unified process for collecting, analysing and comparing genomic data, with the benefit of elastic cloud computing. SP3 enables container-centric bioinformatic workflows to run on personal computers, high-performance computing (HPC) clusters and cloud platforms. We have deployed and tested SP3 on local HPC, Google Cloud Platform (GCP), Microsoft Azure and OpenStack platforms. SP3 allows users to fetch genomic sequencing data from the European Nucleotide Archive (ENA) and analyse it with open-source bioinformatic pipelines. We believe SP3 will promote common standards for pathogen genomic data quality, data processing and data analysis, helping to address the challenge of tool divergence and to leverage the pool of public genomic data repositories and cloud resources.
first_indexed 2024-03-07T04:52:41Z
format Conference item
id oxford-uuid:d581fd3c-d772-4556-a281-aa91f70942cb
institution University of Oxford
last_indexed 2024-03-07T04:52:41Z
publishDate 2019
publisher IEEE Digital Library
record_format dspace
title Scalable Pathogen Pipeline Platform (SP^3): enabling unified genomic data analysis with elastic cloud computing