Evaluating the Performance of the Generalized Linear Model (glm) R Package Using Single-Cell RNA-Sequencing Data
The glm R package is commonly used for generalized linear modeling. In this paper, we evaluate the ability of the glm package to predict binomial outcomes using logistic regression. We use single-cell RNA-sequencing datasets, after a series of normalization steps, to fit glm models repeatedly us...
Main Authors: | Omar Alaqeeli, Raad Alturki |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-10-01 |
Series: | Applied Sciences |
Subjects: | generalized linear model; classification; single-cell RNA-sequencing; precision; recall; F1-Score |
Online Access: | https://www.mdpi.com/2076-3417/13/20/11512 |
_version_ | 1827721872895115264 |
---|---|
author | Omar Alaqeeli; Raad Alturki |
author_sort | Omar Alaqeeli |
collection | DOAJ |
description | The glm R package is commonly used for generalized linear modeling. In this paper, we evaluate the ability of the glm package to predict binomial outcomes using logistic regression. We normalize single-cell RNA-sequencing datasets in several steps and then repeatedly fit glm models to them using 10-fold cross-validation over 100 iterations. Our evaluation criteria are glm’s Precision, Recall, F1-Score, Area Under the Curve (AUC), and Runtime. We collect the scores in each evaluation category and calculate their medians. Our findings show that glm’s Precision and F1-Score fluctuate, that its Recall is more stable, and that it performs remarkably well in terms of AUC. The Runtime of glm is consistent. We also find no correlation between the size of the fitted data and glm’s Precision, Recall, F1-Score, or AUC; only Runtime correlates with data size. |
first_indexed | 2024-03-10T21:28:47Z |
format | Article |
id | doaj.art-d64e02f50b2b4b399d8943fd1b4f91f7 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T21:28:47Z |
publishDate | 2023-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-d64e02f50b2b4b399d8943fd1b4f91f7; 2023-11-19T15:33:12Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-10-01; vol. 13, no. 20, art. 11512; DOI 10.3390/app132011512; Omar Alaqeeli (0), Raad Alturki (1); (0) Department of Computer Science, Saudi Electronic University, Riyadh 11673, Saudi Arabia; (1) Department of Computer Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia; https://www.mdpi.com/2076-3417/13/20/11512; generalized linear model; classification; single-cell RNA-sequencing; precision; recall; F1-Score |
title | Evaluating the Performance of the Generalized Linear Model (glm) R Package Using Single-Cell RNA-Sequencing Data |
topic | generalized linear model; classification; single-cell RNA-sequencing; precision; recall; F1-Score |
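
The record's description names Precision, Recall, and F1-Score as evaluation criteria and says that the median of the collected scores is reported. A minimal sketch of those calculations is given here in Python rather than R. The function names and the toy labels are hypothetical, and this is not the authors' code; it assumes outcomes coded as 0/1 and scores a metric as 0.0 when its denominator is zero.

```python
from statistics import median


def precision_recall_f1(actual, predicted):
    """Return (precision, recall, f1) for binary labels coded as 0/1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    # Assumption: an undefined ratio (zero denominator) is scored as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def median_scores(runs):
    """Median of each metric over a list of (actual, predicted) runs."""
    scores = [precision_recall_f1(a, p) for a, p in runs]
    # zip(*scores) regroups the tuples into one column per metric.
    return tuple(median(column) for column in zip(*scores))


# Hypothetical labels for two evaluation runs (illustrative only, not data from the paper).
runs = [
    ([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]),
    ([1, 1, 0, 0, 1], [1, 1, 0, 1, 1]),
]
print(median_scores(runs))
```

In a setup like the one described, each entry of `runs` would hold the held-out labels and the thresholded glm predictions from one cross-validation fold or iteration.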