Optimizing UniFrac with OpenACC Yields Greater Than One Thousand Times Speed Increase
ABSTRACT UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another (beta diversity). Striped UniFrac recently added the ability to split the problem into many independent subproblems, exhibiting nearly linear scaling but suffer...
Main Authors: | Igor Sfiligoi, George Armstrong, Antonio Gonzalez, Daniel McDonald, Rob Knight |
---|---|
Format: | Article |
Language: | English |
Published: | American Society for Microbiology, 2022-06-01 |
Series: | mSystems |
Subjects: | microbiome, GPU, OpenACC, optimization, UniFrac |
Online Access: | https://journals.asm.org/doi/10.1128/msystems.00028-22 |
_version_ | 1811342733965524992 |
---|---|
author | Igor Sfiligoi George Armstrong Antonio Gonzalez Daniel McDonald Rob Knight |
collection | DOAJ |
description | ABSTRACT: UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another (beta diversity). Striped UniFrac recently added the ability to split the problem into many independent subproblems, exhibiting nearly linear scaling but suffering from memory contention. Here, we adapt UniFrac to graphics processing units using OpenACC, enabling greater than 1,000× computational improvement, and apply it to 307,237 samples, the largest 16S rRNA V4 uniformly preprocessed microbiome data set analyzed to date. IMPORTANCE: UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another. Here, we adapt UniFrac to operate on graphics processing units, enabling a 1,000× computational improvement. To highlight this advance, we perform what may be the largest microbiome analysis to date, applying UniFrac to 307,237 16S rRNA V4 microbiome samples preprocessed with Deblur. These scaling improvements turn UniFrac into a real-time tool for common data sets and unlock new research questions as more microbiome data are collected. |
first_indexed | 2024-04-13T19:15:56Z |
format | Article |
id | doaj.art-c324a129963e4ad5b8e43f5f10c0b6ce |
institution | Directory Open Access Journal |
issn | 2379-5077 |
language | English |
last_indexed | 2024-04-13T19:15:56Z |
publishDate | 2022-06-01 |
publisher | American Society for Microbiology |
record_format | Article |
series | mSystems |
spelling | doaj.art-c324a129963e4ad5b8e43f5f10c0b6ce; 2022-12-22T02:33:41Z; eng; American Society for Microbiology; mSystems; ISSN 2379-5077; 2022-06-01; vol. 7, no. 3; doi:10.1128/msystems.00028-22; Optimizing UniFrac with OpenACC Yields Greater Than One Thousand Times Speed Increase; Igor Sfiligoi (San Diego Supercomputing Center, University of California, San Diego, La Jolla, California, USA); George Armstrong (Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, California, USA); Antonio Gonzalez (Department of Pediatrics, University of California, San Diego, La Jolla, California, USA); Daniel McDonald (Department of Pediatrics, University of California, San Diego, La Jolla, California, USA); Rob Knight (Department of Pediatrics, University of California, San Diego, La Jolla, California, USA); https://journals.asm.org/doi/10.1128/msystems.00028-22; microbiome; GPU; OpenACC; optimization; UniFrac |
title | Optimizing UniFrac with OpenACC Yields Greater Than One Thousand Times Speed Increase |
topic | microbiome GPU OpenACC optimization UniFrac |
url | https://journals.asm.org/doi/10.1128/msystems.00028-22 |