NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data
Abstract: The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account...
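Main Authors: | He, Liang; Davila-Velderrain, Jose; Sumida, Tomokazu S; Hafler, David A; Kellis, Manolis; Kulminski, Alexander M |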
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | English |
Published: | Springer Science and Business Media LLC, 2022 |
Online Access: | https://hdl.handle.net/1721.1/143719 |
_version_ | 1826213666588983296 |
---|---|
author | He, Liang Davila-Velderrain, Jose Sumida, Tomokazu S Hafler, David A Kellis, Manolis Kulminski, Alexander M |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory He, Liang Davila-Velderrain, Jose Sumida, Tomokazu S Hafler, David A Kellis, Manolis Kulminski, Alexander M |
author_sort | He, Liang |
collection | MIT |
description | Abstract: The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA). The speed gain is achieved by analytically solving high-dimensional integrals instead of using the Laplace approximation. We demonstrate that NEBULA is orders of magnitude faster than existing tools and controls false-positive errors in marker gene identification and co-expression analysis. Using NEBULA in Alzheimer’s disease cohort data sets, we found that the cell-level expression of APOE correlated with that of other genetic risk factors (including CLU, CST3, TREM2, C1q, and ITM2B) in a cell-type-specific pattern and an isoform-dependent manner in microglia. NEBULA opens up a new avenue for the broad application of mixed models to large-scale multi-subject single-cell data. |
first_indexed | 2024-09-23T15:52:54Z |
format | Article |
id | mit-1721.1/143719 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T15:52:54Z |
publishDate | 2022 |
publisher | Springer Science and Business Media LLC |
record_format | dspace |
spelling | mit-1721.1/143719 2023-07-07T20:32:38Z NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data He, Liang Davila-Velderrain, Jose Sumida, Tomokazu S Hafler, David A Kellis, Manolis Kulminski, Alexander M Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Abstract: The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA). The speed gain is achieved by analytically solving high-dimensional integrals instead of using the Laplace approximation. We demonstrate that NEBULA is orders of magnitude faster than existing tools and controls false-positive errors in marker gene identification and co-expression analysis. Using NEBULA in Alzheimer’s disease cohort data sets, we found that the cell-level expression of APOE correlated with that of other genetic risk factors (including CLU, CST3, TREM2, C1q, and ITM2B) in a cell-type-specific pattern and an isoform-dependent manner in microglia. NEBULA opens up a new avenue for the broad application of mixed models to large-scale multi-subject single-cell data. 2022-07-13T17:06:57Z 2022-07-13T17:06:57Z 2021 2022-07-13T16:49:30Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/143719 He, Liang, Davila-Velderrain, Jose, Sumida, Tomokazu S, Hafler, David A, Kellis, Manolis et al. 2021. "NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data." Communications Biology, 4 (1). en 10.1038/S42003-021-02146-6 Communications Biology Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/ application/pdf Springer Science and Business Media LLC Nature |
spellingShingle | He, Liang Davila-Velderrain, Jose Sumida, Tomokazu S Hafler, David A Kellis, Manolis Kulminski, Alexander M NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data |
title | NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data |
title_full | NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data |
title_fullStr | NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data |
title_full_unstemmed | NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data |
title_short | NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data |
title_sort | nebula is a fast negative binomial mixed model for differential or co expression analysis of large scale multi subject single cell data |
url | https://hdl.handle.net/1721.1/143719 |
work_keys_str_mv | AT heliang nebulaisafastnegativebinomialmixedmodelfordifferentialorcoexpressionanalysisoflargescalemultisubjectsinglecelldata AT davilavelderrainjose nebulaisafastnegativebinomialmixedmodelfordifferentialorcoexpressionanalysisoflargescalemultisubjectsinglecelldata AT sumidatomokazus nebulaisafastnegativebinomialmixedmodelfordifferentialorcoexpressionanalysisoflargescalemultisubjectsinglecelldata AT haflerdavida nebulaisafastnegativebinomialmixedmodelfordifferentialorcoexpressionanalysisoflargescalemultisubjectsinglecelldata AT kellismanolis nebulaisafastnegativebinomialmixedmodelfordifferentialorcoexpressionanalysisoflargescalemultisubjectsinglecelldata AT kulminskialexanderm nebulaisafastnegativebinomialmixedmodelfordifferentialorcoexpressionanalysisoflargescalemultisubjectsinglecelldata |