State of the Art in Parallel Computing with R

R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper pres...

Bibliographic Details
Main Authors: Markus Schmidberger, Luke Tierney, Dirk Eddelbuettel, Hao Yu, Ulrich Mansmann, Martin Morgan
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2009-06-01
Series: Journal of Statistical Software
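Subjects: R, high performance computing, parallel computing, computer cluster, multi-core systems, grid computing, benchmark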
Online Access: http://www.jstatsoft.org/v31/i01/paper
collection DOAJ
description R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly suited to general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems five different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix.
first_indexed 2024-04-13T02:03:08Z
id doaj.art-beb05f9c04264466a9df48d2c9e50ac3
institution Directory of Open Access Journals
issn 1548-7660
last_indexed 2024-04-13T02:03:08Z
topic R
high performance computing
parallel computing
computer cluster
multi-core systems
grid computing
benchmark