rworkflows: automating reproducible practices for the R community

Abstract Despite calls to improve reproducibility in research, achieving this goal remains elusive even within computational fields. Currently, >50% of R packages are distributed exclusively through GitHub. While the trend towards sharing open-source software has been revolutionary, GitHub does n...


Bibliographic Details
Main Authors: Brian M. Schilder, Alan E. Murphy, Nathan G. Skene
Format: Article
Language: English
Published: Nature Portfolio 2024-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-023-44484-5
collection DOAJ
description Abstract Despite calls to improve reproducibility in research, achieving this goal remains elusive even within computational fields. Currently, >50% of R packages are distributed exclusively through GitHub. While the trend towards sharing open-source software has been revolutionary, GitHub does not have any default built-in checks for minimal coding standards or software usability. This makes it difficult to assess the current quality of R packages, or to use them consistently over time and across platforms. While GitHub-native solutions are technically possible, they require considerable time and expertise for each developer to write, implement, and maintain. To address this, we develop rworkflows, a suite of tools for robust continuous integration and deployment (https://github.com/neurogenomics/rworkflows). rworkflows can be implemented by developers of all skill levels using a one-time R function call, which has both sensible defaults and extensive options for customisation. Once implemented, any update to the GitHub repository automatically triggers parallel workflows that install all software dependencies, run code checks, generate a dedicated documentation website, and deploy a publicly accessible containerised environment. By making the rworkflows suite free, automated, and simple to use, we aim to promote widespread adoption of reproducible practices across a continually growing R community.
first_indexed 2024-03-08T16:16:33Z
format Article
id doaj.art-a11a3cccc3cb411ab4f7c0f7d3122975
institution Directory Open Access Journal
issn 2041-1723
language English
last_indexed 2024-03-08T16:16:33Z
publishDate 2024-01-01
publisher Nature Portfolio
record_format Article
series Nature Communications
url https://doi.org/10.1038/s41467-023-44484-5