The impact of Docker containers on the performance of genomic pipelines

Genomic pipelines consist of several pieces of third party software and, because of their experimental nature, frequent changes and updates are commonly necessary thus raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these probl...

Bibliographic Details
Main Authors: Paolo Di Tommaso, Emilio Palumbo, Maria Chatzou, Pablo Prieto, Michael L. Heuer, Cedric Notredame
Format: Article
Language: English
Published: PeerJ Inc. 2015-09-01
Series: PeerJ
Subjects: Workflow, Pipelines, Docker, Virtualisation, Bioinformatics
Online Access: https://peerj.com/articles/1273.pdf
author Paolo Di Tommaso
Emilio Palumbo
Maria Chatzou
Pablo Prieto
Michael L. Heuer
Cedric Notredame
collection DOAJ
description Genomic pipelines consist of several pieces of third party software and, because of their experimental nature, frequent changes and updates are commonly necessary thus raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these problems, as they allow the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus, the question that arises is to what extent the use of Docker containers might affect the performance of these pipelines. Here we address this question and conclude that Docker containers have only a minor impact on the performance of common genomic pipelines, which is negligible when the executed jobs are long in terms of computational time.
first_indexed 2024-03-09T08:00:14Z
format Article
id doaj.art-0e8cf15c6a164b10b92cabc7f498a043
institution Directory Open Access Journal
issn 2167-8359
language English
last_indexed 2024-03-09T08:00:14Z
publishDate 2015-09-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-0e8cf15c6a164b10b92cabc7f498a043 2023-12-03T00:46:56Z eng PeerJ Inc. PeerJ 2167-8359 2015-09-01 3 e1273 10.7717/peerj.1273
The impact of Docker containers on the performance of genomic pipelines
Paolo Di Tommaso, Emilio Palumbo, Maria Chatzou, Pablo Prieto, Cedric Notredame: Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), Barcelona, Spain
Michael L. Heuer: Department of Bioinformatics Research, National Marrow Donor Program, Minneapolis, MN, United States
https://peerj.com/articles/1273.pdf
Workflow, Pipelines, Docker, Virtualisation, Bioinformatics
title The impact of Docker containers on the performance of genomic pipelines
topic Workflow
Pipelines
Docker
Virtualisation
Bioinformatics
url https://peerj.com/articles/1273.pdf